THREE NOVEL GENES ENCODING A ZINC FINGER PROTEIN, A GUANINE NUCLEOTIDE EXCHANGE FACTOR
AND A HEAT SHOCK PROTEIN OR HEAT SHOCK BINDING PROTEIN

FIELD OF THE INVENTION 

The present invention relates generally to a novel human gene and its derivatives and to
mammalian, animal, insect, nematode, avian and microbial homologues thereof. The present
invention further provides pharmaceutical compositions and diagnostic agents as well as genetic 
molecules useful in gene replacement therapy and recombinant molecules useful in protein 
replacement therapy. 



BACKGROUND OF THE INVENTION 

Bibliographic details of the publications referred to by author in this specification are collected 
at the end of the description. 



The increasing sophistication of recombinant DNA technology is greatly facilitating research and 
development in the medical and allied health fields. There is growing need to develop 
recombinant and genetic molecules for use in diagnosis and in conventional pharmaceutical 
preparations as well as in gene and protein replacement therapies. 



In work leading up to the present invention, the inventors sought to identify and clone human 
genes which might be useful as potential diagnostic and/or therapeutic agents. Molecules of 
particular interest targeted by the inventors were gene regulators including regulatory proteins, 
signal transducers and heat shock proteins.



Gene expression generally requires interaction between a regulatory protein and an appropriate 
recognition sequence of a target gene. Regulatory proteins comprise in many cases a domain or 
motif which facilitates binding to DNA. One particular motif comprises small sequence units 
repeated in tandem with each unit folded about a zinc atom to form separate structural domains. 
This motif is now referred to as a zinc finger domain. Such a domain is generally defined by the
number of cysteine (C) and histidine (H) residues.

In addition, knowledge of cellular interaction in the control of cell proliferation is essential in the
rational design of specific therapeutic strategies aimed at controlling proliferative disorders.
Such proliferative disorders include a range of cancers, inflammatory conditions and
atherosclerosis. An important aspect of cellular interaction is signal transduction via receptors
to intracellular transducers. One key signal transducer is Ras, which couples the receptors for
diverse extracellular signals to different effectors. Ras directly activates the downstream kinase
Raf, which in turn induces the mitogen-activated protein kinase (MAPK) cascade.

Another regulatory mechanism involves heat shock proteins. The Escherichia coli heat shock
protein, DnaJ, is the founding member of a family of proteins which are associated with protein
folding, protein complex assembly and transit through subcellular components.

Prokaryotic and eukaryotic DnaJ homologues have a modular organisation consisting of a J
domain, a glycine-rich spacer, CXXCXGXG [SEQ ID NO:1] repeats and a C-terminal region
with no obvious sequence features, as well as additional sequences for protein targeting. The
J domain is anticipated to mediate interaction with heat shock 70 proteins (Hsp70) and consists
of some 70 amino acids, frequently located at the N-terminus of the protein.
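
By way of illustration only, the CXXCXGXG repeat described above can be located in a protein sequence with a simple pattern search. The following is a minimal sketch, assuming that X denotes any single residue written as an upper-case one-letter code; the function name and the test sequence are hypothetical and do not come from the specification.

```python
import re

# CXXCXGXG with X = any residue; the lookahead lets overlapping hits be reported.
CXXCXGXG = re.compile(r"(?=(C[A-Z]{2}C[A-Z]G[A-Z]G))")


def find_cxxcxgxg(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched octapeptide) for every CXXCXGXG hit."""
    return [(m.start() + 1, m.group(1)) for m in CXXCXGXG.finditer(protein.upper())]


# Hypothetical test sequence, invented for illustration only.
DEMO = "MAAACPTCNGSGAACEHCKGTGLL"
```

Calling find_cxxcxgxg(DEMO) reports the position and text of each repeat; a sequence lacking the motif yields an empty list.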

In accordance with the present invention, genes have been identified from the human genome
which encode proteins having a regulatory role. One gene, in accordance with the present
invention, encodes a protein with an N-terminal region resembling a zinc finger domain of a novel
type. Another gene encodes a protein involved in guanine nucleotide exchange factor (GEF)
signalling pathways. Yet another gene encodes a protein which is a heat shock protein or heat
shock-like protein which may have a role in tumour suppression.

SUMMARY OF THE INVENTION 

Throughout this specification, unless the context requires otherwise, the word "comprise", or 
variations such as "comprises" or "comprising", will be understood to imply the inclusion of a 
stated element or integer or group of elements or integers but not the exclusion of any other
element or integer or group of elements or integers.

Sequence identity numbers (SEQ ID NOs.) for nucleotide and amino acid sequences referred to
in the subject specification are defined after the bibliography. A summary of SEQ ID NOs. is
also given in Table 1.

One aspect of the present invention contemplates an isolated nucleic acid molecule comprising
a sequence of nucleotides encoding or complementary to a sequence encoding an amino acid
sequence having homology to a regulator of gene expression or a derivative of said gene
regulator.

Another aspect of the present invention provides an isolated nucleic acid molecule comprising
a sequence of nucleotides encoding or complementary to a sequence encoding a regulator of
gene expression wherein said regulator comprises a zinc finger domain of an (HC3)2 type.

Yet another aspect of the present invention is directed to an isolated nucleic acid molecule
comprising a sequence of nucleotides or a complementary form thereof selected from:

(i) a nucleotide sequence set forth in SEQ ID NO:2;

(ii) a nucleotide sequence encoding an amino acid sequence set forth in SEQ ID NO:3;

(iii) a nucleotide sequence having at least about 40% similarity to the nucleotide sequence
of (i) or (ii); and

(iv) a nucleotide sequence capable of hybridizing under low stringency conditions at 42°C
to the nucleotide sequence set forth in (i), (ii) or (iii).

The nucleotide sequence set forth in SEQ ID NO:2 defines the gene, mcg4. This gene encodes
a product, MCG4, having an amino acid sequence set forth in SEQ ID NO:3.

Even yet another aspect of the present invention provides a genetic construct comprising a vector
portion and an animal, more particularly a mammalian and even more particularly a human mcg4
gene portion, which mcg4 gene portion is capable of encoding an MCG4 polypeptide or a
functional or immunologically interactive derivative thereof.

Still yet another aspect of the present invention contemplates a method of detecting a condition
caused or facilitated by an aberration in mcg4, said method comprising determining the presence
of a single or multiple nucleotide substitution, deletion and/or addition or other aberration to one
or both alleles of said mcg4 wherein the presence of such a nucleotide substitution, deletion
and/or addition or other aberration may be indicative of said condition or a propensity to develop
said condition.

Even still a further aspect of the present invention relates to a method of detecting a condition
caused or facilitated by an aberration in mcg4, said method comprising screening for a single or
multiple amino acid substitution, deletion and/or addition to MCG4 wherein the presence of such
a mutation is indicative of said condition or a propensity to develop said condition.

Another aspect of the present invention contemplates a method for detecting MCG4 or a
derivative thereof in a biological sample, said method comprising contacting said biological
sample with an antibody specific for MCG4 or its derivatives or homologues for a time and under
conditions sufficient for an antibody-MCG4 complex to form, and then detecting said complex.

A further aspect of the present invention contemplates an isolated nucleic acid molecule
comprising a sequence of nucleotides encoding or complementary to a sequence encoding an
amino acid sequence having homology to a guanine nucleotide exchange factor (GEF) or a
derivative thereof.

Yet another aspect of the present invention is directed to an isolated nucleic acid molecule
comprising a sequence of nucleotides or a complementary form thereof selected from:

(i) a nucleotide sequence set forth in SEQ ID NO:4 or 6;

(ii) a nucleotide sequence encoding an amino acid sequence set forth in SEQ ID NO:5
or 7;

(iii) a nucleotide sequence having at least about 40% similarity to the nucleotide sequence
of (i) or (ii); and

(iv) a nucleotide sequence capable of hybridizing under low stringency conditions to the
nucleotide sequence set forth in (i), (ii) or (iii).



The nucleotide sequence set forth in SEQ ID NO:4 or 6 defines the gene, mcg7. This gene 
encodes a product, MCG7, having an amino acid sequence set forth in SEQ ID NO:5 or 7. 



Even yet another aspect of the present invention provides a genetic construct comprising a vector 
portion and an animal, more particularly a mammalian and even more particularly a human mcg7 
gene portion, which mcg7 gene portion is capable of encoding an MCG7 polypeptide or a 
functional or immunologically interactive derivative thereof. 



Still yet another aspect of the present invention contemplates a method of detecting a condition
caused or facilitated by an aberration in mcg7, said method comprising determining the presence
of a single or multiple nucleotide substitution, deletion and/or addition or other aberration to one
or both alleles of said mcg7 wherein the presence of such a nucleotide substitution, deletion
and/or addition or other aberration may be indicative of said condition or a propensity to develop
said condition.

Even still a further aspect of the present invention relates to a method of detecting a condition
caused or facilitated by an aberration in mcg7, said method comprising screening for a single or
multiple amino acid substitution, deletion and/or addition to MCG7 wherein the presence of such
a mutation is indicative of said condition or a propensity to develop said condition.

Another aspect of the present invention contemplates a method for detecting MCG7 or a
derivative thereof in a biological sample, said method comprising contacting said biological
sample with an antibody specific for MCG7 or its derivatives or homologues for a time and under
conditions sufficient for an antibody-MCG7 complex to form, and then detecting said complex.

Yet another aspect of the present invention contemplates an isolated nucleic acid molecule
comprising a sequence of nucleotides encoding or complementary to a sequence encoding an
amino acid sequence having homology to a heat shock protein or a heat shock binding protein
or a derivative thereof.

Another aspect of the present invention is directed to an isolated nucleic acid molecule
comprising a sequence of nucleotides or a complementary form thereof selected from:

(i) a nucleotide sequence set forth in SEQ ID NO:8;

(ii) a nucleotide sequence encoding an amino acid sequence set forth in SEQ ID NO:9;

(iii) a nucleotide sequence having at least about 40% similarity to the nucleotide sequence
of (i) or (ii); and

(iv) a nucleotide sequence capable of hybridizing under low stringency conditions at 42°C
to the nucleotide sequence set forth in (i), (ii) or (iii).

The nucleotide sequence set forth in SEQ ID NO:8 defines the gene, mcg18. This gene encodes
a product, MCG18, having an amino acid sequence set forth in SEQ ID NO:9.

Even yet another aspect of the present invention provides a genetic construct comprising a vector
portion and an animal, more particularly a mammalian and even more particularly a human
mcg18 gene portion, which mcg18 gene portion is capable of encoding an MCG18 polypeptide
or a functional or immunologically interactive derivative thereof.

Still yet another aspect of the present invention contemplates a method of detecting a condition
caused or facilitated by an aberration in mcg18, said method comprising determining the presence
of a single or multiple nucleotide substitution, deletion and/or addition or other aberration to one
or both alleles of said mcg18 wherein the presence of such a nucleotide substitution, deletion
and/or addition or other aberration may be indicative of said condition or a propensity to develop
said condition.

Even still a further aspect of the present invention relates to a method of detecting a condition
caused or facilitated by an aberration in mcg18, said method comprising screening for a single
or multiple amino acid substitution, deletion and/or addition to MCG18 wherein the presence of
such a mutation is indicative of said condition or a propensity to develop said condition.

Another aspect of the present invention contemplates a method for detecting MCG18 or a
derivative thereof in a biological sample, said method comprising contacting said biological
sample with an antibody specific for MCG18 or its derivatives or homologues for a time and
under conditions sufficient for an antibody-MCG18 complex to form, and then detecting said
complex.

A summary of SEQ ID NOs. referred to in the subject specification is shown in Table 1.

TABLE 1
SUMMARY OF SEQ ID NOs.

SEQ ID NO.   DESCRIPTION

1            amino acid repeat sequence in DnaJ homologues
2            nucleotide sequence of mcg4
3            amino acid sequence of MCG4
4            nucleotide sequence of mcg7
5            amino acid sequence of MCG7
6            nucleotide sequence of mcg7 without exon of nucleotides 183-288
7            amino acid sequence of MCG7 without exon of nucleotides 183-288
8            nucleotide sequence of mcg18
9            amino acid sequence of MCG18
10-18        amino acid sequence identified using BESTFIT
19           sequence of pGEX and mcg7 junction
20           sequence of pGEX and mcg7 junction
21           nucleotide sequence of myc-tag/mcg7 junction
22           amino acid sequence corresponding to SEQ ID NO:21
23           nucleotide sequence of pGEX and mcg7 junction
24           amino acid sequence corresponding to SEQ ID NO:23
25-36        mcg7-specific oligonucleotide
37-45        mcg18-specific oligonucleotide

Single and three letter abbreviations for amino acid residues are shown in Table 2.

TABLE 2

Amino acid       Three-letter     One-letter
                 abbreviation     symbol

Alanine          Ala              A
Arginine         Arg              R
Asparagine       Asn              N
Aspartic acid    Asp              D
Cysteine         Cys              C
Glutamine        Gln              Q
Glutamic acid    Glu              E
Glycine          Gly              G
Histidine        His              H
Isoleucine       Ile              I
Leucine          Leu              L
Lysine           Lys              K
Methionine       Met              M
Phenylalanine    Phe              F
Proline          Pro              P
Serine           Ser              S
Threonine        Thr              T
Tryptophan       Trp              W
Tyrosine         Tyr              Y
Valine           Val              V
Any residue      Xaa              X

BRIEF DESCRIPTION OF THE FIGURES

Figure 1 is a representation of the nucleotide sequence [SEQ ID NO:2] and corresponding
amino acid sequence [SEQ ID NO:3] of mcg4.

Figure 2 is a representation of the alignment of the human MCG4 amino acid sequence with a 
translation of a partial murine expressed sequence tag (EST). 

Figure 3 is a representation of the alignment of the human MCG4 amino acid sequence with a 
translation of a partial nematode EST.

Figure 4 is a diagrammatic representation showing a predicted structure of MCG4 where H and
C represent histidine and cysteine residues, respectively, X refers to any amino acid residue and
Zn represents zinc atoms.

Figure 5 is a representation of a sensitive sequence homology search for related cysteine-containing
motifs in another Caenorhabditis elegans protein.

Figure 6 is a representation showing that a related cysteine-containing motif is present in the
GATA-binding transcription factor from Schizosaccharomyces pombe.

Figure 7 is a Northern blot showing expression of mcg4 in various cultured human cancer cell
lines. Lanes 1-5 represent the hybridization signal from 15 µg total RNA derived from,
respectively, H69 lung carcinoma cells, JAM ovary carcinoma cells, BT20 breast carcinoma cells,
HaCat transformed keratinocytes and T24 bladder carcinoma cells.

Figure 8 is a representation of a partial alignment of mcg4 with human ESTs AA074703 and
AA134788.

Figure 9 is a representation of the partial nucleotide sequence alignment between a human
(W32939) and mouse (AA242159) mcg4-like EST in the putative 5' UTR of the mcg4 cDNA.
The putative initiation codon is underlined and the region upstream represents the 5' UTR.

Figure 10 is a representation showing a MacVector alignment of MCG4 with forward translations
of ESTs AA134788 and AA074703. The nucleotide sequences are shown in Figure 8.

Figure 11 is a diagrammatic representation of the domains of MCG4:

zinc finger consensus: CX2HX4CX2CX4HX2CX17CX2CX18HX2CX18CX2C
acidic domain consensus: 9/34 amino acids negatively charged, 0/34 positively charged
basic domain consensus: 13/55 amino acids positively charged, 0/55 negatively charged
leucine zipper domain consensus: LX6LX6RX6LX6L

alternate "novel" leucine zipper-like motif where leucine would not be aligned along the one
surface of an alpha helix domain: (aa 261) LX6LXLX6LXLX6L (aa 286).
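
As an illustrative aid to reading the consensus notation above, the sketch below converts a consensus of the form CX2HX4C (a letter X followed by an optional count meaning that many arbitrary residues) into a regular expression and computes the number of residues it spans. It assumes that each numbered spacer in the alternate leucine zipper-like motif is six residues long, a reading which agrees with the stated span of amino acids 261 to 286 (26 residues). The function names are hypothetical.

```python
import re


def consensus_to_regex(consensus: str) -> str:
    """Turn 'CX2HX4C' style notation into a regular expression string."""
    return re.sub(r"X(\d*)", lambda m: "[A-Z]{%s}" % (m.group(1) or "1"), consensus)


def consensus_length(consensus: str) -> int:
    """Number of residues spanned by a consensus written in that notation."""
    spacers = re.findall(r"X(\d*)", consensus)
    fixed = len(re.sub(r"X\d*", "", consensus))
    return fixed + sum(int(n or 1) for n in spacers)
```

For example, consensus_length("LX6LXLX6LXLX6L") can be compared against 286 - 261 + 1 as a consistency check on the spacer lengths.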

Figure 12 is a representation showing similarity of MCG7 with GEFs of various organisms.

Figure 13(a) is a representation of the nucleotide sequence [SEQ ID NO:4] and corresponding
amino acid sequence [SEQ ID NO:5] of mcg7. Nucleotides 183-288 are an alternatively spliced
exon (shown in lower case).

Figure 13(b) is a representation of the partial nucleotide sequence [SEQ ID NO: 6] and 
corresponding amino acid sequence [SEQ ID NO:7] of mcg7 but without the exon shown in Fig. 
13(a). Amino acids have been numbered from the first methionine codon (underlined). The 
cDNA molecules of Fig. 13(a) and Fig. 13(b) differ by the inclusion and exclusion of the exon 
of nucleotides 183-288.

Figure 14 is a representation showing a comparison between MCG7 and a homologue from
Caenorhabditis elegans using the BESTFIT algorithm. In the figure, the following sequences
are underlined:

EF-Hand = PROSITE DATABASE NO. PDOC00018

1a nematode  DVDEEDEVEDIEF [SEQ ID NO:10]
1b human     DVDGDGHISQEEF [SEQ ID NO:11]
   nematode  DHDRDGFISQEEF [SEQ ID NO:12]
1c human     DQNQDGCISREEM [SEQ ID NO:13]
   nematode  DVDMDGQISKDEL [SEQ ID NO:14]

GUANINE NT BINDING REGION = BLOCKS DATABASE NO. BL00720B

2 human      HFVHVAEKlJ^I^NFmiJVL\W^ [SEQ ID NO:15]
  nematode   KFVHVAKHLRKINNFNTLMSVVGGITHSSVARLAKTY [SEQ ID NO:16]

DaG-PE BINDING DOMAIN = PROSITE DATABASE NO. PDOC00379

3 human      HNFQESNSLRPVACRHCKALILGIYKQGLKCRACGVNCHKQCKDRl^X^ [SEQ ID NO:17]
  nematode   HNFHETTFLTPTTTCNHCNKLLWGILRQGFKCKDCGLAVHSCCKSNAVAEC [SEQ ID NO:18]

Figure 15 is a representation of an alignment of human and a partial (5' UTR and partial coding
sequence) murine mcg7 cDNA (GenBank Acc. No. W71787 and AA237373). The putative
initiation codon is underlined. The murine sequence represents a composite of 2 partial cDNA
sequences from the EST database (accession numbers W71787 and AA237373). Nucleotide
differences between human and murine sequences are shown in lower case lettering and identical
residues are indicated with asterisks.

Figure 16 is a representation of further 5' nucleotide and corresponding amino acid sequence for
human mcg7. Nucleotide positions 1-321 were derived from GenBank Acc. No. AC000134 and
nucleotides 322 onwards from Fig. 13(a). Two in-frame initiation codons are underlined.
Asterisks denote in-frame stop codons.

Figure 17 is a graphical representation of a GDP release assay. □ Experiment #1 (mean of
duplicates). 0 Experiment #2 (mean of duplicates). The exchange reaction contained 36 pmol
of GST-MCG7 (N-terminally truncated; encoded by Construct B in Fig. 18) and 1.6-12.8 pmol
of recombinant GST-N-Ras.GDP. Reaction time 6 min.
Estimated reaction constants:

Km = 2.1 µM, Vmax = 37 pmol/6 min/36 pmol [Expt #1]
Km = 1.5 µM, Vmax = 30.3 pmol/6 min/36 pmol [Expt #2]

Figure 18 depicts various recombinant plasmids containing partial or full-length mcg7.

Figure 19 is a representation of the nucleotide sequence [SEQ ID NO:8] and corresponding
amino acid sequence [SEQ ID NO:9] of mcg18.

Figure 20 is a representation showing that MCG18 has partial homology to E. coli DnaJ.

Figure 21 is a representation showing that MCG18 has homology to two Caenorhabditis elegans
proteins.

Figure 22 is a representation showing that MCG18 has homology to a Schizosaccharomyces pombe
protein.

Figure 23 is a representation showing homology of MCG18 to a Drosophila virilis protein.

Figure 24 is a representation showing homology of MCG18 to human DnaJ proteins HDJ-2/HSDJ,
HDJ-1/HSP40 and HSJ1.

Figure 25 is a representation of the nucleotide and corresponding amino acid sequence of murine
mcg18.

Figure 26 is a representation of homology between human and murine MCG18.

Figure 27 depicts nucleotide sequences corresponding to the 5' untranslated region of human
mcg18.

Figure 28 depicts a Northern blot showing expression of mcg18 transcripts in total RNA isolated
from various human cancer cell lines grown in culture. Lanes 1-5 respectively contain 15 µg
RNA from H69 lung carcinoma cells, JAM ovary carcinoma cells, BT20 breast carcinoma cells,
HaCat transformed keratinocytes and T24 bladder carcinoma cells.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

The present invention provides an isolated nucleic acid molecule comprising a sequence of
nucleotides encoding or complementary to a sequence encoding an amino acid sequence having
homology to a regulator of gene expression or a derivative of said gene regulator.

More particularly, the present invention is directed to an isolated nucleic acid molecule
comprising a sequence of nucleotides encoding or complementary to a sequence encoding a
regulator of gene expression wherein said regulator comprises a zinc finger domain of an (HC3)2
type.

Still more particularly, the present invention provides an isolated nucleic acid molecule
comprising a sequence of nucleotides or a complementary form thereof selected from:

(i) a nucleotide sequence set forth in SEQ ID NO:2;

(ii) a nucleotide sequence encoding an amino acid sequence set forth in SEQ ID NO:3;

(iii) a nucleotide sequence having at least about 40% similarity to the nucleotide sequence
of (i) or (ii); and

(iv) a nucleotide sequence capable of hybridizing under low stringency conditions at 42°C
to the nucleotide sequence set forth in (i), (ii) or (iii).



The present invention also provides an isolated nucleic acid molecule comprising a sequence of
nucleotides encoding or complementary to a sequence encoding an amino acid sequence having
homology to a guanine nucleotide exchange factor (GEF) or a derivative thereof.

More particularly, the present invention is directed to an isolated nucleic acid molecule
comprising a sequence of nucleotides or a complementary form thereof selected from:

(i) a nucleotide sequence set forth in SEQ ID NO:4 or 6;

(ii) a nucleotide sequence encoding an amino acid sequence set forth in SEQ ID NO:5
or 7;

(iii) a nucleotide sequence having at least about 40% similarity to the nucleotide sequence
of (i) or (ii); and

(iv) a nucleotide sequence capable of hybridizing under low stringency conditions at 42°C
to the nucleotide sequence set forth in (i), (ii) or (iii).

Another aspect of the present invention contemplates an isolated nucleic acid molecule 
comprising a sequence of nucleotides encoding or complementary to a sequence encoding an 
amino acid sequence having homology to a heat shock protein or a heat shock-binding protein 
or a derivative thereof. 


More particularly, the present invention is directed to an isolated nucleic acid molecule
comprising a sequence of nucleotides or a complementary form thereof selected from:

(i) a nucleotide sequence set forth in SEQ ID NO:8;

(ii) a nucleotide sequence encoding an amino acid sequence set forth in SEQ ID NO:9;

(iii) a nucleotide sequence having at least about 40% similarity to the nucleotide sequence
of (i) or (ii); and

(iv) a nucleotide sequence capable of hybridizing under low stringency conditions at 42°C
to the nucleotide sequence set forth in (i), (ii) or (iii).

Preferably, the percentage similarity is at least about 50%. More preferably, the percentage 
similarity is at least about 60%. 

Reference herein to a low stringency at 42°C includes and encompasses from at least about 1%
v/v to at least about 15% v/v formamide and from at least about 1M to at least about 2M salt for
hybridisation, and at least about 1M to at least about 2M salt for washing conditions. Alternative
stringency conditions may be applied where necessary, such as medium stringency, which
includes and encompasses from at least about 16% v/v to at least about 30% v/v formamide and
from at least about 0.5M to at least about 0.9M salt for hybridisation, and at least about 0.5M
to at least about 0.9M salt for washing conditions, or high stringency, which includes and
encompasses from at least about 31% v/v to at least about 50% v/v formamide and from at least
about 0.01M to at least about 0.15M salt for hybridisation, and at least about 0.01M to at least
about 0.15M salt for washing conditions.
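
For convenience, the three stringency classes defined above may be tabulated and looked up programmatically. The following is a minimal sketch only: it treats each "at least about" range as a closed numerical interval, which is a simplifying assumption, and the names STRINGENCY and classify_stringency are hypothetical.

```python
from typing import Optional

# Formamide in % v/v and salt in mol/L, taken from the ranges stated in the text.
# Treating the "at least about" bounds as hard limits is an assumption.
STRINGENCY = {
    "low": {"formamide": (1.0, 15.0), "salt": (1.0, 2.0)},
    "medium": {"formamide": (16.0, 30.0), "salt": (0.5, 0.9)},
    "high": {"formamide": (31.0, 50.0), "salt": (0.01, 0.15)},
}


def classify_stringency(formamide_pct: float, salt_molar: float) -> Optional[str]:
    """Return 'low', 'medium' or 'high', or None if no class fits both values."""
    for name, limits in STRINGENCY.items():
        f_lo, f_hi = limits["formamide"]
        s_lo, s_hi = limits["salt"]
        if f_lo <= formamide_pct <= f_hi and s_lo <= salt_molar <= s_hi:
            return name
    return None
```

A combination falling outside every class, such as low formamide with very low salt, returns None rather than being forced into the nearest class.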

The term "similarity" as used herein includes exact identity between compared sequences at the
nucleotide or amino acid level. Where there is non-identity at the nucleotide level, "similarity"
includes differences between sequences which result in different amino acids that are nevertheless
related to each other at the structural, functional, biochemical and/or conformational levels.
Where there is non-identity at the amino acid level, "similarity" includes amino acids that are
nevertheless related to each other at the structural, functional, biochemical and/or conformational
levels.

The present invention extends to nucleic acid molecules with percentage similarities of 
approximately 65%, 70%, 75%, 80%, 85%, 90% or 95% or above or a percentage in between. 
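
The identity component of such a percentage can be illustrated with a short calculation. The sketch below counts identical positions in two pre-aligned sequences of equal length; it does not score "related" residues, since the specification names no scoring scheme for them, and the function name is hypothetical. The two example sequences are the human and nematode EF-hand segments listed as SEQ ID NO:11 and SEQ ID NO:12.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Columns that are gaps in both sequences carry no information; drop them.
    columns = [(x, y) for x, y in zip(a.upper(), b.upper()) if (x, y) != ("-", "-")]
    if not columns:
        return 0.0
    identical = sum(1 for x, y in columns if x == y)
    return 100.0 * identical / len(columns)


SEQ_11 = "DVDGDGHISQEEF"  # human EF-hand segment, SEQ ID NO:11
SEQ_12 = "DHDRDGFISQEEF"  # nematode EF-hand segment, SEQ ID NO:12
```

Ten of the thirteen aligned positions are identical in this pair, so the result lies well above the 40% threshold recited above.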

The nucleic acid molecule of the present invention defined by SEQ ID NO:2 is hereinafter
referred to as constituting the "mcg4" gene. The protein encoded by mcg4 is referred to herein
as "MCG4" and has an amino acid sequence set forth in SEQ ID NO:3. The mcg4 gene is
proposed to encode, in accordance with the present invention, a regulator of gene expression
which comprises a novel zinc finger domain, (HC3)2. A regulator of gene expression includes a
transcription factor. Regulation may be at the level of nucleic acid:protein or protein:protein
interaction.

The nucleic acid molecule of the present invention defined by SEQ ID NO:4 or 6 is hereinafter
referred to as constituting the "mcg7" gene. The protein encoded by mcg7 is referred to herein
as "MCG7" and has an amino acid sequence set forth in SEQ ID NO:5 or 7 and is involved in
signal transduction. The difference in the nucleotide and amino acid sequence is due to the
presence or absence of an exon at nucleotides 183-288.
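
The relationship between the two forms can be pictured as removal of the stated nucleotide range from the longer sequence. The following sketch assumes 1-based, inclusive coordinates (so that nucleotides 183-288 comprise 106 positions); the function name is hypothetical and the test sequence is a placeholder, not the mcg7 cDNA.

```python
def skip_exon(cdna: str, first: int = 183, last: int = 288) -> str:
    """Remove nucleotides first..last (1-based, inclusive) from a cDNA string."""
    if not 1 <= first <= last <= len(cdna):
        raise ValueError("exon coordinates fall outside the sequence")
    return cdna[: first - 1] + cdna[last:]


# Placeholder sequence with a lower-case "exon" at positions 183-288.
WITH_EXON = "A" * 182 + "c" * 106 + "G" * 12
```

Applying skip_exon to WITH_EXON joins the upstream and downstream segments directly, shortening the sequence by 106 nucleotides.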

The nucleic acid molecule of the present invention defined by SEQ ID NO:8 is hereinafter
referred to as constituting the "mcg18" gene. The protein encoded by mcg18 is referred to
herein as "MCG18" and comprises the amino acid sequence set forth in SEQ ID NO:9.

The present invention extends to the naturally occurring genomic mcg4, mcg7 and mcg18
nucleotide sequences or corresponding cDNA sequences or to derivatives thereof. Derivatives
contemplated in the present invention include fragments, parts, portions, mutants, homologues
and analogues of MCG4, MCG7 or MCG18 or the corresponding genetic sequences. Derivatives
also include single or multiple amino acid substitutions, deletions and/or additions to MCG4,
MCG7 or MCG18 or single or multiple nucleotide substitutions, deletions and/or additions to
mcg4, mcg7 or mcg18. "Additions" to the amino acid or nucleotide sequences include fusions
with other peptides, polypeptides or proteins or fusions to nucleotide sequences. Reference
herein to "MCG4" or "mcg4", "MCG7" or "mcg7" or "MCG18" or "mcg18" includes reference to
all derivatives thereof including functional derivatives and immunologically interactive derivatives
of MCG4, MCG7 or MCG18.

The mcg4, mcg7 and mcg18 of the present invention are particularly exemplified herein from
humans and in particular from human chromosome 11q13.

The present invention extends, however, to a range of homologues from, for example, primates,
livestock animals (eg. sheep, cows, horses, donkeys, pigs), companion animals (eg. dogs, cats),
laboratory test animals (eg. rabbits, mice, rats, guinea pigs), reptiles, birds (eg. chickens, ducks,
geese, parrots), insects, nematodes, eukaryotic microorganisms and captive wild animals (eg.
deer, foxes, kangaroos). Reference herein to mcg4, mcg7 and mcg18 or their respective proteins
MCG4, MCG7 and MCG18 includes reference to these molecules of human origin as well as
novel forms of non-human origin.

The nucleic acid molecules of the present invention may be DNA or RNA. When the nucleic 
acid molecule is in DNA form, it may be genomic DNA or cDNA. RNA forms of the nucleic
acid molecules of the present invention are generally mRNA. 

Although the nucleic acid molecules of the present invention are generally in isolated form, they
may be integrated into or ligated to or otherwise fused or associated with other genetic
molecules such as vector molecules and in particular expression vector molecules. Vectors and
expression vectors are generally capable of replication and, if applicable, expression in one or
both of a prokaryotic cell or a eukaryotic cell. Preferably, prokaryotic cells include E. coli,
Bacillus sp and Pseudomonas sp. Preferred eukaryotic cells include yeast, fungal, mammalian
and insect cells.

Accordingly, another aspect of the present invention contemplates a genetic construct comprising
a vector portion and an animal, more particularly a mammalian and even more particularly a 
human mcg4 gene portion, which mcg4 gene portion is capable of encoding an MCG4 
polypeptide or a functional or immunologically interactive derivative thereof. 

Preferably, the mcg4 gene portion of the genetic construct is operably linked to a promoter in
the vector such that said promoter is capable of directing expression of said mcg4 gene portion 
in an appropriate cell. 

In addition, the mcg4 gene portion of the genetic construct may comprise all or part of the gene 
fused to another genetic sequence such as a nucleotide sequence encoding
glutathione-S-transferase or part thereof.

The present invention extends to such genetic constructs and to prokaryotic or eukaryotic cells 
comprising same. 


It is proposed in accordance with the present invention that MCG4 is a transcription factor 
involved in gene regulation. Mutations in mcg4 may result in aberrations in gene regulation 
leading to the development of or a propensity to develop various types of cancer. In this regard, 
although not wishing to limit the present invention to any one hypothesis or mode of action, it 
is proposed that mcg4 or its expression product may be involved in the tissue-specific or
temporal regulation of particular genes. 

A deletion or aberration in the mcg4 gene may also be important in the detection of cancer or 
a propensity to develop cancer. An aberration may be a homozygous mutation or a 
heterozygous mutation. The detection may occur at the foetal or post-natal level. Detection
may also be at the germline or somatic cell level. Furthermore, a risk of developing cancer may
be determined by assaying for aberrations in the parents and/or proband of a subject under
investigation. 

According to this aspect of the present invention, there is contemplated a method of detecting 
a condition caused or facilitated by an aberration in mcg4, said method comprising determining
the presence of a single or multiple nucleotide substitution, deletion and/or addition or other 
aberration to one or both alleles of said mcg4 wherein the presence of such a nucleotide 
substitution, deletion and/or addition or other aberration may be indicative of said condition or 
a propensity to develop said condition. 
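
By way of illustration only, the comparison underlying such a method may be sketched in Python as follows. The sketch assumes that each allele has already been sequenced over the same region as a reference sequence, and it uses the standard-library `difflib.SequenceMatcher` as a stand-in for a proper sequence aligner. The function names `classify_aberrations` and `zygosity` and the short sequences in the usage note are hypothetical and are not taken from mcg4, mcg7 or mcg18.

```python
from difflib import SequenceMatcher

# Map difflib edit tags onto the terms used in the text.
KINDS = {"replace": "substitution", "delete": "deletion", "insert": "addition"}


def classify_aberrations(reference, sample):
    """List differences between a reference sequence and a sample allele.

    Each entry is (kind, 1-based reference position, reference bases, sample bases).
    """
    matcher = SequenceMatcher(None, reference, sample, autojunk=False)
    found = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            found.append((KINDS[tag], i1 + 1, reference[i1:i2], sample[j1:j2]))
    return found


def zygosity(reference, allele_a, allele_b):
    """Classify a pair of alleles by whether they carry the same aberrations.

    Simplification: alleles carrying two different aberrations are also
    reported as heterozygous.
    """
    a = classify_aberrations(reference, allele_a)
    b = classify_aberrations(reference, allele_b)
    if not a and not b:
        return "no aberration detected"
    return "homozygous" if a == b else "heterozygous"
```

For example, with a hypothetical reference `"ATGGCCTTAGGC"`, calling `classify_aberrations` on the sample `"ATGGACTTAGGC"` reports a single substitution, and `zygosity` reports a heterozygous state when only one of the two alleles carries it. A text diff of this kind is suitable only for short, closely related sequences.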

Another aspect of the present invention contemplates a genetic construct comprising a vector 
portion and an animal, more particularly a mammalian and even more particularly a human mcg7 
gene portion, which mcg7 gene portion is capable of encoding an MCG7 polypeptide or a
functional or immunologically interactive derivative thereof. 

Preferably, the mcg7 gene portion of the genetic construct is operably linked to a promoter on
the vector such that said promoter is capable of directing expression of said mcg7 gene portion
in an appropriate cell. 

In addition, the mcg7 gene portion of the genetic construct may comprise all or part of the gene
fused to another genetic sequence such as a nucleotide sequence encoding
glutathione-S-transferase or part thereof.

The present invention extends to such genetic constructs and to prokaryotic or eukaryotic cells 
comprising same.

It is proposed in accordance with the present invention that MCG7 is a GEF involved in signal 
transduction. Mutations in mcg7 or MCG7 may result in defective control of cell proliferation
leading to the development of or a propensity to develop various types of cancer. 

A deletion or aberration in the mcg7 gene may also be important in the detection of cancer or
a propensity to develop cancer. An aberration may be a homozygous mutation or a
heterozygous mutation. The detection may occur at the foetal or post-natal level. Detection 
may also be at the germline or somatic cell level. Furthermore, a risk of developing cancer may 
be determined by assaying for aberrations in the parents of a subject under investigation. 

According to this aspect of the present invention, there is contemplated a method of detecting 
a condition caused or facilitated by an aberration in mcg7, said method comprising determining
the presence of a single or multiple nucleotide substitution, deletion and/or addition or other
aberration to one or both alleles of said mcg7 wherein the presence of such a nucleotide
substitution, deletion and/or addition or other aberration may be indicative of said condition or
a propensity to develop said condition. 

Yet another aspect of the present invention contemplates a genetic construct comprising a vector
portion and an animal, more particularly a mammalian and even more particularly a human
mcg18 gene portion, which mcg18 gene portion is capable of encoding an MCG18 polypeptide
or a functional or immunologically interactive derivative thereof. 

Preferably, the mcg18 gene portion of the genetic construct is operably linked to a promoter on
the vector such that said promoter is capable of directing expression of said mcg18 gene portion
in an appropriate cell.

In addition, the mcg18 gene portion of the genetic construct may comprise all or part of the gene
fused to another genetic sequence such as a nucleotide sequence encoding
glutathione-S-transferase or part thereof.

The present invention extends to such genetic constructs and to prokaryotic or eukaryotic cells
comprising same. 

It is proposed in accordance with the present invention that MCG18 is a transcription factor 
involved in protein folding, protein complex assembly and transit through subcellular
compartments. MCG18 may also have a role in tumour suppression. Thus mutations in mcg18
may result in the development of or a propensity to develop various types of cancer.

A deletion or aberration in the mcg18 gene may also be important in the detection of cancer or
a propensity to develop cancer. An aberration may be a homozygous mutation or a
heterozygous mutation. The detection may occur at the foetal or post-natal level. Detection
may also be at the germline or somatic cell level. Furthermore, a risk of developing cancer may
be determined by assaying for aberrations in the parents and/or proband of a subject under
investigation.



According to this aspect of the present invention, there is contemplated a method of detecting
a condition caused or facilitated by an aberration in mcg18, said method comprising determining
the presence of a single or multiple nucleotide substitution, deletion and/or addition or other
aberration to one or both alleles of said mcg18 wherein the presence of such a nucleotide
substitution, deletion and/or addition or other aberration may be indicative of said condition or
a propensity to develop said condition.

The nucleotide substitutions, additions or deletions may be detected by any convenient means
including nucleotide sequencing, restriction fragment length polymorphism (RFLP), polymerase
chain reaction (PCR), oligonucleotide hybridization and single stranded conformation
polymorphism analysis (SSCP) amongst many others. An aberration includes modification to
existing nucleotides such as to modify glycosylation signal amongst other effects.

In an alternative method, aberrations in the mcg4, mcg7 and mcg18 genes are detected by
screening for mutations in MCG4, MCG7 and MCG18, respectively. Such a mutation may be
a single or multiple amino acid substitution, addition and/or deletion. The mutation in mcg4,
mcg7 or mcg18 may also result in either no [illegible] or the introduction of side chain
modifications to amino acid residues.

According to this aspect of the present invention, there is provided a method of detecting a
condition caused or facilitated by an aberration in mcg4, mcg7 or mcg18, said method comprising
screening for a single or multiple amino acid substitution, deletion and/or addition to MCG4,
MCG7 or MCG18 wherein the presence of such a mutation is indicative of said condition or of
a propensity to develop said condition.

A particularly convenient means of detecting a mutation in MCG4, MCG7 or MCG18 is by use
of antibodies. 

Accordingly, another aspect of the present invention is directed to antibodies to MCG4, MCG7
or MCG18 and their derivatives. Such antibodies may be monoclonal or polyclonal and may be
selected from naturally occurring antibodies to MCG4, MCG7 or MCG18 or may be specifically
raised to MCG4, MCG7 or MCG18 or derivatives thereof. In the case of the latter, MCG4,
MCG7 or MCG18 or their derivatives may first need to be associated with a carrier molecule.

The antibodies to MCG4, MCG7 or MCG18 of the present invention are particularly useful as
diagnostic agents. 

For example, antibodies to MCG4, MCG7 or MCG18 and their derivatives can be used to screen
for wild-type MCG4, MCG7 or MCG18 or for mutated MCG4, MCG7 or MCG18 molecules.
The latter may occur, for example, during or prior to certain cancer development. A differential
binding assay is also particularly useful. Techniques for such assays are well known in the art
and include, for example, sandwich assays and ELISA. Knowledge of normal MCG4, MCG7
or MCG18 levels or the presence of wild-type MCG4, MCG7 or MCG18 may be important for
diagnosis of certain cancers or a predisposition for development of cancers or for monitoring
certain therapeutic protocols.

As stated above, antibodies to MCG4, MCG7 or MCG18 of the present invention may be
monoclonal or polyclonal or may be fragments of antibodies such as Fab fragments.
Furthermore, the present invention extends to recombinant and synthetic antibodies and to
antibody hybrids. A "synthetic antibody" is considered herein to include fragments and hybrids
of antibodies.

For example, specific antibodies can be used to screen for wild-type MCG4, MCG7 or MCG18
molecules or specific mutant molecules such as molecules having a certain deletion. This would
be important, for example, as a means for screening for levels of MCG4, MCG7 or MCG18 in
a cell extract or other biological fluid or purifying MCG4, MCG7 or MCG18 made by
recombinant means from culture supernatant fluid or purified from a cell extract. Techniques for
the assays contemplated herein are known in the art and include, for example, sandwich assays 
and ELISA. 



It is within the scope of this invention to include any second antibodies (monoclonal, polyclonal 
or fragments of antibodies or synthetic antibodies) directed to the first mentioned antibodies
discussed above. Both the first and second antibodies may be used in detection assays or a first 
antibody may be used with a commercially available anti-immunoglobulin antibody. An antibody 
as contemplated herein includes any antibody specific to any region of wild-type MCG4, MCG7 
or MCG18 or to a specific mutant phenotype or to a deleted or otherwise altered region. 

Both polyclonal and monoclonal antibodies are obtainable by immunization of a suitable animal 
or bird with MCG4, MCG7 or MCG18 or its derivatives and either type is utilizable for 
immunoassays. The methods of obtaining both types of sera are well known in the art.
Polyclonal sera are less preferred but are relatively easily prepared by injection of a suitable
laboratory animal or bird with an effective amount of MCG4, MCG7 or MCG18 or antigenic
parts thereof or derivatives thereof, collecting serum from the animal or bird, and isolating
specific sera by any of the known immunoadsorbent techniques. Although antibodies produced
by this method are utilizable in virtually any type of immunoassay, they are generally less
favoured because of the potential heterogeneity of the product. 

The use of monoclonal antibodies in an immunoassay is particularly preferred because of the 
ability to produce them in large quantities and the homogeneity of the product. The preparation 
of hybridoma cell lines for monoclonal antibody production derived by fusing an immortal cell
line and lymphocytes sensitized against the immunogenic preparation can be done by techniques
which are well known to those who are skilled in the art.

Another aspect of the present invention contemplates a method for detecting MCG4, MCG7 or
MCG18 or a derivative thereof in a biological sample, said method comprising contacting said
biological sample with an antibody specific for MCG4, MCG7 or MCG18 or its derivatives or
homologues for a time and under conditions sufficient for an antibody-MCG4, MCG7 or
MCG18 complex to form, and then detecting said complex.

Preferably, the biological sample is a cell extract from a human or other animal or a bird. 

The presence of MCG4, MCG7 or MCG18 may be determined in a number of ways such as
by Western blotting and ELISA procedures. A wide range of immunoassay techniques are
available as can be seen by reference to US Patent Nos. 4,016,043, 4,424,279 and 4,018,653.
These include both single-site and two-site or "sandwich" assays of the non-competitive types, 
as well as traditional competitive binding assays. These assays also include direct binding of a 
labelled antibody to a target. 

Sandwich assays are among the most useful and commonly used assays and are favoured for use
in the present invention. A number of variations of the sandwich assay technique exist, and all
are intended to be encompassed by the present invention. Briefly, in a typical forward assay, an
unlabelled antibody is immobilized on a solid substrate and the sample to be tested brought into
contact with the bound molecule. After a suitable period of incubation, for a period of time
sufficient to allow formation of an antibody-antigen complex, a second antibody specific to the
antigen, labelled with a reporter molecule capable of producing a detectable signal, is then added
and incubated, allowing time sufficient for the formation of another complex of
antibody-antigen-labelled antibody. Any unreacted material is washed away, and the presence of
the antigen is determined by observation of a signal produced by the reporter molecule. The
results may either be qualitative, by simple observation of the visible signal, or may be
quantitated by comparing with a control sample containing known amounts of hapten.
Variations on the forward assay include a simultaneous assay, in which both sample and labelled
antibody are added simultaneously to the bound antibody. These techniques are well known to
those skilled in the art, including any minor variations as will be readily apparent. In accordance
with the present invention the sample is one which might contain MCG4, MCG7 or MCG18
including cell extract or tissue biopsy. The sample is, therefore, generally a biological sample
comprising biological fluid but also extends to fermentation fluid and supernatant fluid such as
from a cell culture.
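
As a non-limiting illustration of the quantitation step, the comparison of a test signal with control samples containing known amounts of antigen may be sketched in Python as follows. The sketch assumes straight-line interpolation between adjacent control points, which is a simplification; the function name `interpolate_amount` and the control values in the usage note are hypothetical.

```python
def interpolate_amount(standards, signal):
    """Estimate the amount of antigen giving rise to a measured signal.

    standards: list of (known_amount, measured_signal) pairs from control samples.
    Returns the interpolated amount, or None when the signal lies outside
    the range covered by the controls (no extrapolation is attempted).
    """
    # Order the control points by signal so that adjacent pairs bracket a range.
    points = sorted(standards, key=lambda p: p[1])
    for (a0, s0), (a1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return a0  # flat segment: report the lower known amount
            # Straight-line interpolation between the two bracketing controls.
            return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)
    return None
```

For example, with hypothetical controls of 0, 10 and 50 units giving signals of 0.05, 0.25 and 1.05 respectively, a test signal of 0.65 falls midway between the upper two controls and is assigned an amount of approximately 30 units. A signal outside the control range returns `None`, indicating that the sample should be diluted or the controls extended.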

In the typical forward sandwich assay, a first antibody having specificity for the MCG4, MCG7
or MCG18 or an antigenic part thereof or a derivative thereof or antigenic parts thereof, is either
covalently or passively bound to a solid surface. The solid surface is typically glass or a polymer,
the most commonly used polymers being cellulose, polyacrylamide, nylon, polystyrene, polyvinyl
chloride or polypropylene. The solid supports may be in the form of tubes, beads, discs of
microplates, or any other surface suitable for conducting an immunoassay. The binding
processes are well known in the art and generally consist of cross-linking, covalently binding or
physically adsorbing the antibody, after which the polymer-antibody complex is washed in
preparation for the test sample. An aliquot of the sample to be tested is then added to the solid
phase complex and incubated for a period of time sufficient (e.g. 2-40 minutes or overnight if
more convenient) and under suitable conditions (e.g. from room temperature to 37 °C) to allow
binding of any subunit present in the antibody. Following the incubation period, the antibody
subunit solid phase is washed and dried and incubated with a second antibody specific for a
portion of the hapten. The second antibody is linked to a reporter molecule which is used to
indicate the binding of the second antibody to the hapten.

An alternative method involves immobilizing the target molecules in the biological sample and
then exposing the immobilized target to specific antibody which may or may not be labelled with
a reporter molecule. Depending on the amount of target and the strength of the reporter
molecule signal, a bound target may be detectable by direct labelling with the antibody.
Alternatively, a second labelled antibody, specific to the first antibody, is exposed to the
target-first antibody complex to form a target-first antibody-second antibody tertiary complex.
The complex is detected by the signal emitted by the reporter molecule.

By "reporter molecule" as used in the present specification, is meant a molecule which, by its 
chemical nature, provides an analytically identifiable signal which allows the detection of
antigen-bound antibody. Detection may be either qualitative or quantitative. The most commonly
used reporter molecules in this type of assay are either enzymes, fluorophores or radionuclide
containing molecules (i.e. radioisotopes) and chemiluminescent molecules.

In the case of an enzyme immunoassay, an enzyme is conjugated to the second antibody,
generally by means of glutaraldehyde or periodate. As will be readily recognized, however, a
wide variety of different conjugation techniques exist, which are readily available to the skilled
artisan. Commonly used enzymes include horseradish peroxidase, glucose oxidase,
beta-galactosidase and alkaline phosphatase, amongst others. The substrates to be used with the
specific enzymes are generally chosen for the production, upon hydrolysis by the corresponding
enzyme, of a detectable colour change. Examples of suitable enzymes include alkaline
phosphatase and peroxidase. It is also possible to employ fluorogenic substrates, which yield a
fluorescent product rather than the chromogenic substrates noted above. In all cases, the
enzyme-labelled antibody is added to the first antibody-hapten complex, allowed to bind, and
then the excess reagent is washed away. A solution containing the appropriate substrate is then
added to the complex of antibody-antigen-antibody. The substrate will react with the enzyme
linked to the second antibody, giving a qualitative visual signal, which may be further quantitated,
usually spectrophotometrically, to give an indication of the amount of hapten which was present
in the sample. "Reporter molecule" also extends to use of cell agglutination or inhibition of
agglutination such as red blood cells on latex beads, and the like.

Alternatively, fluorescent compounds, such as fluorescein and rhodamine, may be chemically
coupled to antibodies without altering their binding capacity. When activated by illumination
with light of a particular wavelength, the fluorochrome-labelled antibody absorbs the light
energy, inducing a state of excitability in the molecule, followed by emission of the light at a
characteristic colour visually detectable with a light microscope. As in the EIA, the fluorescent
labelled antibody is allowed to bind to the first antibody-hapten complex. After washing off the
unbound reagent, the remaining tertiary complex is then exposed to the light of the appropriate
wavelength; the fluorescence observed indicates the presence of the hapten of interest.
Immunofluorescence and EIA techniques are both very well established in the art and are
particularly preferred for the present method. However, other reporter molecules, such as
radioisotope, chemiluminescent or bioluminescent molecules, may also be employed.

As stated above, the present invention extends to genetic constructs capable of encoding MCG4,
MCG7 or MCG18 or functional derivatives thereof. Such genetic constructs are also
contemplated to be useful in modulating expression of specific genes in which mcg4, mcg7 or
mcg18 is involved in tissue-specific or temporal regulation.

Accordingly, another aspect of the present invention is directed to a genetic construct comprising
a nucleotide sequence encoding a peptide, polypeptide or protein and mcg4, mcg7 or mcg18 or
a functional derivative or homologue thereof capable of modulating the expression of said 
nucleotide sequence. 

As stated above, MCG18 is proposed to have a role in tumour suppression. Accordingly, it is
further proposed in accordance with the present invention to use recombinant MCG18 in
pharmaceutical preparations for treating, arresting or otherwise ameliorating the effects of certain
cancers. 

Accordingly, another aspect of the present invention contemplates a method for treating,
arresting or otherwise ameliorating the effects of a cancer in an animal or bird, said method 
comprising administering to said animal or bird an effective amount of MCG18 or a functional 
derivative thereof for a time and under conditions sufficient to treat, arrest or otherwise 
ameliorate the effects of said cancer. 

The present invention, therefore, contemplates a pharmaceutical composition comprising 
MCG18 or a derivative thereof or a modulator of mcg18 expression or MCG18 activity and one
or more pharmaceutically acceptable carriers and/or diluents. These components are referred
to hereinafter as the "active ingredients". The active ingredients may also include anti-cancer
agents or agents which facilitate actions of MCG18.

The pharmaceutical forms suitable for injectable use include sterile aqueous solutions (where
water soluble) and sterile powders for the extemporaneous preparation of sterile injectable
solutions. They must be stable under the conditions of manufacture and storage and must be
preserved against the contaminating action of microorganisms such as bacteria and fungi. The
carrier may be a solvent medium containing, for example, water, ethanol, polyol (for example,
glycerol, propylene glycol and liquid polyethylene glycol, and the like), suitable mixtures thereof,
and vegetable oils. The proper fluidity can be maintained, for example, by the use of a coating
such as lecithin and by the use of surfactants. The prevention of the action of microorganisms
can be brought about by various antibacterial and antifungal agents, for example, parabens,
chlorobutanol, phenol, sorbic acid, thimerosal and the like. In many cases, it will be preferable to
include isotonic agents, for example, sugars or sodium chloride. Prolonged absorption of the
injectable compositions can be brought about by the use in the compositions of agents delaying
absorption, for example, aluminum monostearate and gelatin.

Sterile injectable solutions are prepared by incorporating the active compounds in the required 
amount in the appropriate solvent with various of the other ingredients enumerated above, as 
required, followed by filtered sterilization. In the case of sterile powders for the preparation of 
sterile injectable solutions, the preferred methods of preparation are vacuum drying and the 
freeze-drying technique which yield a powder of the active ingredient plus any additional desired 
ingredient from previously sterile-filtered solution thereof. 

When the active ingredients are suitably protected they may be orally administered, for example,
with an inert diluent or with an assimilable edible carrier, or they may be enclosed in hard or soft
shell gelatin capsules, or they may be compressed into tablets, or they may be incorporated
directly with the food of the diet. For oral therapeutic administration, the active compound may
be incorporated with excipients and used in the form of ingestible tablets, buccal tablets, troches,
capsules, elixirs, suspensions, syrups, wafers, and the like. Such compositions and preparations
should contain at least 1% by weight of active compound. The percentage of the compositions
and preparations may, of course, be varied and may conveniently be between about 5% and
about 80% of the weight of the unit. The amount of active compound in such therapeutically
useful compositions is such that a suitable dosage will be obtained. Preferred compositions or
preparations according to the present invention are prepared so that an oral dosage unit form
contains between about 0.1 µg and 2000 mg of active compound.

The tablets, troches, pills, capsules and the like may also contain the components as listed
hereafter. A binder such as gum, acacia, corn starch or gelatin; excipients such as dicalcium
phosphate; a disintegrating agent such as corn starch, potato starch, alginic acid and the like;
a lubricant such as magnesium stearate; and a sweetening agent such as sucrose, lactose or
saccharin may be added or a flavouring agent such as peppermint, oil of wintergreen, or cherry
flavouring. When the dosage unit form is a capsule, it may contain, in addition to materials of
the above type, a liquid carrier. Various other materials may be present as coatings or to
otherwise modify the physical form of the dosage unit. For instance, tablets, pills, or capsules
may be coated with shellac, sugar or both. A syrup or elixir may contain the active compound,
sucrose as a sweetening agent, methyl and propylparabens as preservatives, a dye and flavouring
such as cherry or orange flavour. Of course, any material used in preparing any dosage unit form
should be pharmaceutically pure and substantially non-toxic in the amounts employed. In
addition, the active compound(s) may be incorporated into sustained-release preparations and
formulations.

The present invention also extends to forms suitable for topical application such as creams, 
lotions and gels.

Pharmaceutically acceptable carriers and/or diluents include any and all solvents, dispersion 
media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents and 
the like. The use of such media and agents for pharmaceutical active substances is well known 
in the art. Except insofar as any conventional media or agent is incompatible with the active
ingredient, use thereof in the therapeutic compositions is contemplated. Supplementary active 
ingredients can also be incorporated into the compositions. 

It is especially advantageous to formulate parenteral compositions in dosage unit form for ease
of administration and uniformity of dosage. Dosage unit form as used herein refers to physically
discrete units suited as unitary dosages for the mammalian subjects to be treated; each unit
containing a predetermined quantity of active material calculated to produce the desired
therapeutic effect in association with the required pharmaceutical carrier. The specification for
the novel dosage unit forms of the invention is dictated by and directly dependent on (a) the
unique characteristics of the active material and the particular therapeutic effect to be achieved,
and (b) the limitations inherent in the art of compounding such an active material for the
treatment of disease in living subjects having a diseased condition in which bodily health is
impaired as herein disclosed in detail.

The principal active ingredient is compounded for convenient and effective administration in
effective amounts with a suitable pharmaceutically acceptable carrier in dosage unit form as
hereinbefore disclosed. A unit dosage form can, for example, contain the principal active
compound in amounts ranging from 0.5 µg to about 2000 mg. Expressed in proportions, the
active compound is generally present in from about 0.5 µg to about 2000 mg/ml of carrier. In
the case of compositions containing supplementary active ingredients, the dosages are
determined by reference to the usual dose and manner of administration of the said ingredients.

Effective amounts contemplated by the present invention include those amounts effective to
ameliorate a condition. For example, it is envisaged that effective amounts would range from
about 0.001 µg/kg body weight to about 100 mg/kg body weight. Alternatively, effective
amounts may range from about 0.01 µg/kg body weight to about 10 mg/kg body weight or even
from about 0.1 µg/kg body weight to about 1 mg/kg body weight. Administration may be per
minute, hour, day, week, month or year or may only be a once off administration.
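
Purely as an arithmetic illustration of the ranges recited above, the amount per administration corresponding to a given body weight may be computed as sketched below in Python. The range labels "broad", "intermediate" and "narrow" and the function name `dose_window_ug` are hypothetical conveniences and do not appear elsewhere in this specification; all amounts are expressed in micrograms.

```python
UG_PER_MG = 1000.0

# Ranges recited in the text, expressed in micrograms per kg body weight.
# The labels are hypothetical and are used only to select a range.
DOSE_RANGES_UG_PER_KG = {
    "broad": (0.001, 100 * UG_PER_MG),
    "intermediate": (0.01, 10 * UG_PER_MG),
    "narrow": (0.1, 1 * UG_PER_MG),
}


def dose_window_ug(body_weight_kg, range_name="broad"):
    """Return (lowest, highest) amount in micrograms for one administration."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = DOSE_RANGES_UG_PER_KG[range_name]
    return low * body_weight_kg, high * body_weight_kg
```

For a hypothetical subject of 70 kg, the narrowest range corresponds to approximately 7 µg to 70 mg per administration, and the broadest range to approximately 0.07 µg to 7 g.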

The pharmaceutical composition may also comprise genetic molecules such as a vector capable
of transfecting target cells where the vector carries a nucleic acid molecule capable of modulating
mcg18 expression or MCG18 activity. The vector may, for example, be a viral vector.

As stated above, the present invention further contemplates a range of derivatives of MCG18. 

Derivatives include fragments, parts, portions, mutants, homologues and analogues of the
MCG18 polypeptide and corresponding genetic sequence. Derivatives also include single or
multiple amino acid substitutions, deletions and/or additions to MCG18 or single or multiple
nucleotide substitutions, deletions and/or additions to the genetic sequence encoding MCG18.
"Additions" to amino acid sequences or nucleotide sequences include fusions with other
peptides, polypeptides or proteins or fusions to nucleotide sequences. Reference herein to
"MCG18" includes reference to all derivatives thereof including functional derivatives or MCG18
immunologically interactive derivatives.



WO 98/53061    PCT/AU98/00380



Analogues of MCG18 contemplated herein include, but are not limited to, modification to side
chains, incorporation of unnatural amino acids and/or their derivatives during peptide,
polypeptide or protein synthesis and the use of crosslinkers and other methods which impose
conformational constraints on the proteinaceous molecule or their analogues.



Examples of side chain modifications contemplated by the present invention include
modifications of amino groups such as by reductive alkylation by reaction with an aldehyde
followed by reduction with NaBH4; amidination with methylacetimidate; acylation with acetic
anhydride; carbamoylation of amino groups with cyanate; trinitrobenzylation of amino groups
with 2,4,6-trinitrobenzene sulphonic acid (TNBS); acylation of amino groups with succinic
anhydride and tetrahydrophthalic anhydride; and pyridoxylation of lysine with
pyridoxal-5-phosphate followed by reduction with NaBH4.

The guanidine group of arginine residues may be modified by the formation of heterocyclic 
condensation products with reagents such as 2,3-butanedione, phenylglyoxal and glyoxal.

The carboxyl group may be modified by carbodiimide activation via O-acylisourea formation 
followed by subsequent derivatisation, for example, to a corresponding amide.

Sulphydryl groups may be modified by methods such as carboxymethylation with iodoacetic acid
or iodoacetamide; performic acid oxidation to cysteic acid; formation of mixed disulphides
with other thiol compounds; reaction with maleimide, maleic anhydride or other substituted
maleimide; formation of mercurial derivatives using 4-chloromercuribenzoate,
4-chloromercuriphenylsulphonic acid, phenylmercury chloride, 2-chloromercuri-4-nitrophenol and
other mercurials; carbamoylation with cyanate at alkaline pH.

Tryptophan residues may be modified by, for example, oxidation with N-bromosuccinimide or
alkylation of the indole ring with 2-hydroxy-5-nitrobenzyl bromide or sulphenyl halides.
Tyrosine residues on the other hand, may be altered by nitration with tetranitromethane to form
a 3-nitrotyrosine derivative.

Modification of the imidazole ring of a histidine residue may be accomplished by alkylation with
iodoacetic acid derivatives or N-carbethoxylation with diethylpyrocarbonate.

Examples of incorporating unnatural amino acids and derivatives during peptide synthesis
include, but are not limited to, use of norleucine, 4-amino butyric acid, 4-amino-3-hydroxy-5-
phenylpentanoic acid, 6-aminohexanoic acid, t-butylglycine, norvaline, phenylglycine, ornithine,
sarcosine, 4-amino-3-hydroxy-6-methylheptanoic acid, 2-thienyl alanine and/or D-isomers of
amino acids. A list of unnatural amino acids contemplated herein is shown in Table 3.

TABLE 3

Non-conventional                  Code      Non-conventional                    Code
amino acid                                  amino acid

α-aminobutyric acid               Abu       L-N-methylalanine                   Nmala
α-amino-α-methylbutyrate          Mgabu     L-N-methylarginine                  Nmarg
aminocyclopropane-carboxylate     Cpro      L-N-methylasparagine                Nmasn
                                            L-N-methylaspartic acid             Nmasp
aminoisobutyric acid              Aib       L-N-methylcysteine                  Nmcys
aminonorbornyl-carboxylate        Norb      L-N-methylglutamine                 Nmgln
                                            L-N-methylglutamic acid             Nmglu
cyclohexylalanine                 Chexa     L-N-methylhistidine                 Nmhis
cyclopentylalanine                Cpen      L-N-methylisoleucine                Nmile
D-alanine                         Dal       L-N-methylleucine                   Nmleu
D-arginine                        Darg      L-N-methyllysine                    Nmlys
D-aspartic acid                   Dasp      L-N-methylmethionine                Nmmet
D-cysteine                        Dcys      L-N-methylnorleucine                Nmnle
D-glutamine                       Dgln      L-N-methylnorvaline                 Nmnva
D-glutamic acid                   Dglu      L-N-methylornithine                 Nmorn
D-histidine                       Dhis      L-N-methylphenylalanine             Nmphe
D-isoleucine                      Dile      L-N-methylproline                   Nmpro
D-leucine                         Dleu      L-N-methylserine                    Nmser
D-lysine                          Dlys      L-N-methylthreonine                 Nmthr
D-methionine                      Dmet      L-N-methyltryptophan                Nmtrp
D-ornithine                       Dorn      L-N-methyltyrosine                  Nmtyr
D-phenylalanine                   Dphe      L-N-methylvaline                    Nmval
D-proline                         Dpro      L-N-methylethylglycine              Nmetg
D-serine                          Dser      L-N-methyl-t-butylglycine           Nmtbug
D-threonine                       Dthr      L-norleucine                        Nle
D-tryptophan                      Dtrp      L-norvaline                         Nva
D-tyrosine                        Dtyr      α-methyl-aminoisobutyrate           Maib
D-valine                          Dval      α-methyl-γ-aminobutyrate            Mgabu
D-α-methylalanine                 Dmala     α-methylcyclohexylalanine           Mchexa
D-α-methylarginine                Dmarg     α-methylcyclopentylalanine          Mcpen
D-α-methylasparagine              Dmasn     α-methyl-α-naphthylalanine          Manap
D-α-methylaspartate               Dmasp     α-methylpenicillamine               Mpen
D-α-methylcysteine                Dmcys     N-(4-aminobutyl)glycine             Nglu
D-α-methylglutamine               Dmgln     N-(2-aminoethyl)glycine             Naeg
D-α-methylhistidine               Dmhis     N-(3-aminopropyl)glycine            Norn
D-α-methylisoleucine              Dmile     N-amino-α-methylbutyrate            Nmaabu
D-α-methylleucine                 Dmleu     α-naphthylalanine                   Anap
D-α-methyllysine                  Dmlys     N-benzylglycine                     Nphe
D-α-methylmethionine              Dmmet     N-(2-carbamylethyl)glycine          Ngln
D-α-methylornithine               Dmorn     N-(carbamylmethyl)glycine           Nasn
D-α-methylphenylalanine           Dmphe     N-(2-carboxyethyl)glycine           Nglu
D-α-methylproline                 Dmpro     N-(carboxymethyl)glycine            Nasp
D-α-methylserine                  Dmser     N-cyclobutylglycine                 Ncbut
D-α-methylthreonine               Dmthr     N-cycloheptylglycine                Nchep
D-α-methyltryptophan              Dmtrp     N-cyclohexylglycine                 Nchex
D-α-methyltyrosine                Dmty      N-cyclodecylglycine                 Ncdec
D-α-methylvaline                  Dmval     N-cyclododecylglycine               Ncdod
D-N-methylalanine                 Dnmala    N-cyclooctylglycine                 Ncoct
D-N-methylarginine                Dnmarg    N-cyclopropylglycine                Ncpro
D-N-methylasparagine              Dnmasn    N-cycloundecylglycine               Ncund
D-N-methylaspartate               Dnmasp    N-(2,2-diphenylethyl)glycine        Nbhm
D-N-methylcysteine                Dnmcys    N-(3,3-diphenylpropyl)glycine       Nbhe
D-N-methylglutamine               Dnmgln    N-(3-guanidinopropyl)glycine        Narg
D-N-methylglutamate               Dnmglu    N-(1-hydroxyethyl)glycine           Nthr
D-N-methylhistidine               Dnmhis    N-(hydroxyethyl)glycine             Nser
D-N-methylisoleucine              Dnmile    N-(imidazolylethyl)glycine          Nhis
D-N-methylleucine                 Dnmleu    N-(3-indolylethyl)glycine           Nhtrp
D-N-methyllysine                  Dnmlys    N-methyl-γ-aminobutyrate            Nmgabu
N-methylcyclohexylalanine         Nmchexa   D-N-methylmethionine                Dnmmet
D-N-methylornithine               Dnmorn    N-methylcyclopentylalanine          Nmcpen
N-methylglycine                   Nala      D-N-methylphenylalanine             Dnmphe
N-methylaminoisobutyrate          Nmaib     D-N-methylproline                   Dnmpro
N-(1-methylpropyl)glycine         Nile      D-N-methylserine                    Dnmser
N-(2-methylpropyl)glycine         Nleu      D-N-methylthreonine                 Dnmthr
D-N-methyltryptophan              Dnmtrp    N-(1-methylethyl)glycine            Nval
D-N-methyltyrosine                Dnmtyr    N-methyl-α-naphthylalanine          Nmanap
D-N-methylvaline                  Dnmval    N-methylpenicillamine               Nmpen
γ-aminobutyric acid               Gabu      N-(p-hydroxyphenyl)glycine          Nhtyr
L-t-butylglycine                  Tbug      N-(thiomethyl)glycine               Ncys
L-ethylglycine                    Etg       penicillamine                       Pen
L-homophenylalanine               Hphe      L-α-methylalanine                   Mala
L-α-methylarginine                Marg      L-α-methylasparagine                Masn
L-α-methylaspartate               Masp      L-α-methyl-t-butylglycine           Mtbug
L-α-methylcysteine                Mcys      L-methylethylglycine                Metg
L-α-methylglutamine               Mgln      L-α-methylglutamate                 Mglu
L-α-methylhistidine               Mhis      L-α-methylhomophenylalanine         Mhphe
L-α-methylisoleucine              Mile      N-(2-methylthioethyl)glycine        Nmet
L-α-methylleucine                 Mleu      L-α-methyllysine                    Mlys
L-α-methylmethionine              Mmet      L-α-methylnorleucine                Mnle
L-α-methylnorvaline               Mnva      L-α-methylornithine                 Morn
L-α-methylphenylalanine           Mphe      L-α-methylproline                   Mpro
L-α-methylserine                  Mser      L-α-methylthreonine                 Mthr
L-α-methyltryptophan              Mtrp      L-α-methyltyrosine                  Mtyr
L-α-methylvaline                  Mval      L-N-methylhomophenylalanine         Nmhphe
N-(N-(2,2-diphenylethyl)          Nnbhm     N-(N-(3,3-diphenylpropyl)           Nnbhe
carbamylmethyl)glycine                      carbamylmethyl)glycine
1-carboxy-1-(2,2-diphenyl-        Nmbc
ethylamino)cyclopropane



Crosslinkers can be used, for example, to stabilise 3D conformations, using homo-bifunctional
crosslinkers such as the bifunctional imido esters having (CH2)n spacer groups with n=1 to n=6,
glutaraldehyde, N-hydroxysuccinimide esters and hetero-bifunctional reagents which usually
contain an amino-reactive moiety such as N-hydroxysuccinimide and another group
specific-reactive moiety such as maleimido or dithio moiety (SH) or carbodiimide (COOH). In
addition, peptides can be conformationally constrained by, for example, incorporation of Cα and
Nα-methylamino acids, introduction of double bonds between Cα and Cβ atoms of amino acids and
the formation of cyclic peptides or analogues by introducing covalent bonds such as forming an
amide bond between the N and C termini, between two side chains or between a side chain and
the N or C terminus.

Such analogues also apply in respect of MCG4 and MCG7. 


The present invention further contemplates chemical analogues of MCG18 capable of acting as
antagonists or agonists of MCG18 or which can act as functional analogues of MCG18.
Chemical analogues may not necessarily be derived from MCG18 but may share certain
conformational similarities. Alternatively, chemical analogues may be specifically designed to
mimic certain physicochemical properties of MCG18. Chemical analogues may be chemically
synthesised or may be detected following, for example, natural product screening.

The identification of MCG18 permits the generation of a range of therapeutic molecules capable
of modulating expression of MCG18 or modulating the activity of MCG18. Modulators
contemplated by the present invention include agonists and antagonists of MCG18 expression.
Antagonists of MCG18 expression include antisense molecules, ribozymes and co-suppression
molecules. Agonists include molecules which increase promoter ability or interfere with negative
regulatory mechanisms. Agonists of MCG18 include molecules which overcome any negative
regulatory mechanism. Antagonists of MCG18 include antibodies and inhibitor peptide
fragments.

These types of modifications may be important to stabilise MCG18 if administered to an
individual or for use as a diagnostic reagent.

Other derivatives contemplated by the present invention include a range of glycosylation variants
from a completely unglycosylated molecule to a modified glycosylated molecule. Altered
glycosylation patterns may result from expression of recombinant molecules in different host
cells.



The present invention further contemplates a method for modulating expression
of MCG18 in a human, said method comprising contacting the mcg18 gene encoding MCG18
with an effective amount of a modulator of mcg18 expression for a time and under conditions
sufficient to up-regulate or down-regulate or otherwise modulate expression of mcg18. For
example, a nucleic acid molecule encoding MCG18 or a derivative thereof may be introduced
into a cell to facilitate protection of that cell from becoming cancerous.

Another aspect of the present invention contemplates a method of modulating activity of MCG18
in a human, said method comprising administering to said human a modulating effective amount
of a molecule for a time and under conditions sufficient to increase or decrease MCG18 activity.
The molecule may also be a derivative of MCG18 or a chemical analogue or truncation mutant
of MCG18.



The present invention is further described with reference to the following non-limiting
Examples.

EXAMPLE 1

A human gene (designated mcg4) was identified on chromosome 11q13 that on the basis of
sequence homology is predicted to encode a putative transcription factor of 310 amino acids
(Fig. 1). mcg4 is transcribed in several different cell lines (Fig. 7).

EXAMPLE 2 

The expressed sequence tag (EST) database contains partial sequence data for the murine
(Fig. 2) and nematode (Fig. 3) homologues of mcg4.

EXAMPLE 3 

MCG4 contains a sequence of cysteine residues within the N-terminal region of the protein that
resembles zinc-finger binding domains of a novel type, i.e. (HC3)2 [Fig. 4].

EXAMPLE 4 

Sensitive sequence homology searches reveal that related cysteine-containing motifs are present
in another C. elegans protein (Fig. 5) as well as the GATA-binding transcription factor from
S. pombe (Fig. 6).

EXAMPLE 5 

mcg4 is expected to have commercial value because it is likely to encode a novel transcription
factor that is highly conserved amongst organisms, suggesting an integral role in gene regulation.
mcg4 may also be involved in tissue-specific or temporal regulation of certain genes, making it
a potential target for modulating expression of those downstream effectors.



EXAMPLE 6

Nucleotide sequence data generated from cosmid clone cSRL-72c4 with the T7 primer
(Promega) and Applied Biosystems Incorporated dye terminator sequencing kit was aligned to
the Expressed Sequence Tag (EST) database using the program BLASTN (Altschul et al,
1990) and was found to match numerous human and mouse entries (Table 4 and Figure 2).
These matching ESTs were further used to identify overlapping entries in the EST database
(Table 5). The nucleotide sequences of these human ESTs were compiled using MacVector
4.2.1 software (IBI-Kodak) to produce the cDNA sequence shown in Figure 1. EST entries
[accession numbers illegible] are closely related [illegible] and it is
likely that mcg4 is a member of a newly discovered gene family (Figure 8).

The cDNA sequence of mcg4 was translated in all possible reading frames and compared to the
GenBank non-redundant protein database using the program BLASTX (Altschul et al, 1990) at
the National Center for Biotechnology Information (http//www.ncbi.nih.gov.nlm). As the
protein appeared to be novel, a translation of the longest reading frame for the mcg4 cDNA was
aligned to the EST database using the program TBLASTN, which performed a dynamic
translation of the EST database in all 6 frames. The search results indicated that the nematode
C. elegans had an MCG4-like protein (Figure 3), with the matching domains containing a spatial
sequence of cysteine and histidine residues which resembled a zinc-finger structure (Figure 4).
The program BLASTP was used, therefore, to conduct sensitive searches of the protein
databases for similar zinc-finger motifs. A weak match to the putative zinc-finger domain was
observed for another protein from C. elegans (Figure 5) and a poorer match for the GATA-
binding transcription factor from S. pombe (Figure 6). The putative initiation codon of human
mcg4 is not preceded by an in-frame stop codon and it is therefore possible that the cDNA
presented in Figure 1 is a truncated form. However, sequence alignment of human and mouse
mcg4 ESTs showed a lower degree of nucleotide conservation prior to the assigned initiation
codon, thus supporting the notion that the region represents the 5' UTR (Figure 9). To
determine the expression pattern of mcg4, 15µg of total cellular RNA (RNeasy Mini Kit,
Qiagen) from various human cell lines grown in culture were electrophoresed through 1.2% w/v
MOPS/formaldehyde gels and blotted onto nylon membranes (Amersham) by capillary transfer
using 20 x SSC (Sambrook et al, 1989). Filters were subsequently UV-fixed and hybridised
overnight at 65°C to a radiolabelled (32P-dCTP) cDNA probe (Church and Gilbert, 1984) for
mcg4. After washes in 0.1 x SSC/0.1% w/v SDS at 65°C for 1 hour, the filters were air-dried
and exposed to X-ray film. This Northern analysis showed that mcg4 is expressed as a 1.6kb
message in numerous tissues including breast, ovary, bladder, lung and keratinocytes (Figure 7).
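
The six-frame translation that TBLASTN applies to each EST entry can be illustrated with a
short sketch. The Python below is not the BLAST implementation; the function names are
hypothetical and only the standard genetic code is assumed. It shows how one nucleotide
sequence yields three forward and three reverse-complement reading frames, each of which can
then be compared with a protein query.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative translation table is needed for these ESTs).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Map frame labels (+1..+3, -1..-3) to their translations."""
    frames = {}
    for strand, template in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


if __name__ == "__main__":
    # Hypothetical 15-nt input used purely for illustration.
    for label, peptide in six_frame_translation("ATGGAGCAGAAGCTG").items():
        print(label, peptide)
```

A real search would score each translated frame against the query with a substitution matrix;
the sketch stops at producing the frames.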

EXAMPLE 7 

A human gene (designated mcg7) was identified and isolated from chromosome 11q13 which
encodes a protein that bears striking homology with guanine nucleotide exchange factors (GEFs)
from a wide variety of organisms (Fig. 12).

EXAMPLE 8 

The composite mcg7 cDNA sequence is at least 2.4kb in length and Figure 13(a) shows a
predicted translation product of at least 609 amino acids beginning at methionine 120. An
alternative start site due to alternate exon splicing (indicated in lower case) may yield a protein
of 671 amino acids starting at methionine 58 (Fig. 13a).

EXAMPLE 9

An mcg7 homologue from C. elegans has been identified, the product of which is highly
conserved with that of MCG7 (Fig. 14). There are several salient features of the protein which
have been underlined in Fig. 14, namely: a guanine nucleotide binding region, a diacylglycerol
binding region, and "EF-hand" calcium binding regions. In addition, there are several potential
cAMP, protein kinase C, and casein kinase II phosphorylation sites, as well as a number of
potential sites for glycosylation (not indicated).

EXAMPLE 10 

A number of partial human and murine EST clones exist for mcg7. The GenBank database
contains a cDNA entry (Y12336) encoding a full-length open reading frame (ORF) for human
mcg7 as well as a partial murine mcg7 ORF (Y12339). In addition, the complete genomic
sequence of the human mcg7 gene is contained within GenBank entry AC000134.



EXAMPLE 11 



The best characterised GEFs are members of the ras family of oncoproteins, which play a
role in signal transduction and, when mutated, [illegible] development. A variety
of therapeutic regimes for cancer treatment have been designed to specifically interfere with
signalling pathways. There is potential, therefore, that the product of mcg7 could also be a
target for such clinical strategies.



EXAMPLE 12 



The nucleotide sequence for mcg7 cDNA was extended with genomic DNA sequence from
GenBank accession number AC000134 (positions 1-321) and analysed for additional coding
sequence 5' to the putative initiation codon (nt 681-683) (Fig. 16). An additional in-frame ATG
occurs at position nt 495-497 when the alternatively spliced exon (position nt 504-609) is present
(also shown in Fig. 13(a)). This closely matches the Kozak consensus. When [illegible]
(translation shown in lowercase lettering) (also shown in Fig. 13(b)). Further evidence that the
initiation codon at position nt 681-683 is the true initiation site is given in Figure 15.

Alignment of human and a partial murine mcg7 cDNA sequences is shown in Figure 15. The
putative initiation codon is at position nt 360-362. Both murine ESTs appear to have an
[illegible] of the differentially spliced
exon and the sequence alignment thus suggests that this region represents the 5' UTR of mcg7.

Furthermore, similarity with the C. elegans homologue strongly suggests that the ATG codon at
position nt 360-362 encodes the N-terminus of MCG7.

EXAMPLE 13

Figure 17 shows data from experiments indicating that a truncated version of MCG7 when
expressed as a GST fusion protein (construct B in Fig. 18) can function as a Ras-guanine
nucleotide exchange factor. In brief, Ras (unprocessed and as a GST fusion protein) is loaded
with 3H-GDP then incubated in the presence of excess cold GTP ± GST-MCG7. Full details of
this assay can be found in Porfiri et al.

EXAMPLE 14 

Nucleotide sequence data generated from cosmid clone cSRL-20h12 with the T7 primer
(Promega) and Applied Biosystems Incorporated dye terminator sequencing kit were aligned
to the Expressed Sequence Tag (EST) database using the program BLASTN (Altschul
et al, 1990) and were found to match GenBank entries T78563 (clone 113434), T09103 (clone
HIBBP12) and AA035643 (clone 471819). EST clones 113434 and 471819 were obtained from
Genome Systems Inc. and these DNAs were sequenced on both strands with gene-specific
primers (Table 5) to generate the cDNA sequence of mcg7 shown in Figures 13(a) and (b).

The cDNA sequence of mcg7 was translated in all possible reading frames and compared to the
GenBank non-redundant protein database using the program BLASTX (Altschul et al, 1990) and
the coding region was assigned on the basis of showing homology to the C. elegans protein
F25B3.3 (Figure 14). The mcg7 cDNA composite was suspected to contain a single nucleotide
error that originated from clone 471819 and the correct nucleotide sequence was, therefore,
sought by reverse transcription-polymerase chain reaction (RT-PCR) of the cDNA fragment
from a human cDNA pool. Total RNA was extracted from a human lymphoblastoid cell line
using an RNeasy Mini Kit (Qiagen). cDNA synthesis was conducted with the reverse
transcriptase Superscript II RNaseH- (GIBCO BRL) and random hexamers using the procedure
recommended by the manufacturer (GIBCO BRL). One fortieth of the cDNA mix was
subjected to 35 cycles of PCR using the following cycling conditions: 94°C for 30 seconds, 58°C
for 30 seconds and 72°C for 90 seconds. The 50µl reaction mix consisted of 1x reaction buffer
(Dade Scientific), 2mM dNTP mix, 20pmol of primers (see Table 6) MCG7UF (within the
variably spliced exon of Figure 13(b), between nucleotide positions 184-201) and SGCADRV2
(between nucleotide positions 866-846 of Figure 13(a)) and 10 units of Dynazyme (Dade
Scientific). The resulting PCR product was cloned into the pGEM-T vector (Promega) using
standard methodology and sequenced using gene-specific primers. The correct nucleotide
sequence of mcg7 (as shown in Figure 13(a)) matches that of the recently released GenBank entry
Y12336. A partial mouse mcg7 cDNA sequence can also be found in GenBank entry Y12339.

EXAMPLE 15 

The coding sequence of mcg7 was cloned into vectors for expression in both bacterial and
mammalian cells. In addition to the full-length constructs, the deletion constructs shown in
Figure 18 were designed to retain the guanine nucleotide exchange factor (GEF) domain. For
prokaryotic expression, the mcg7 coding region was inserted downstream of and in-frame with
the Sj26 cassette of the pGEX (Pharmacia) series of vectors (Smith and Johnson, 1988) using
standard cloning techniques (Sambrook et al, 1989). For mammalian expression, the mcg7
coding sequence was first myc-tagged at the N-terminus and then ligated into the expression
vector pc Exv-n using standard cloning techniques. Ligation junctions of the constructs were
sequenced as the cloning strategies inadvertently changed or introduced additional amino acids
as shown below.

Construct (A): EST clone 113434 was digested with ApaI (Figure 13(a), nucleotide positions
1022 to >2416 (within the vector)), blunt-ended with T4 DNA polymerase according to the
specifications of the manufacturer (New England Biolabs) and ligated into the SmaI site of
pGEX-3X.

Sequence of the pGEX and mcg7 (underlined) junction:
                        pGEX-3X      mcg7 (1022)

Sj26 ...                GGG ATC CCC  CTG GTC       [SEQ ID NO:19]

additional amino acids  Gly Ile Pro

Construct (B): EST clone 113434 was digested with EcoRI (Figure 13(a), nucleotide
positions <695 (within the vector) to 1711) and ligated into the EcoRI site of pGEX-1.

Sequence of the pGEX and mcg7 (underlined) junction:
                        pGEX-1               mcg7 (695)

Sj26 ...                GAA TTC GGC ACG AGC  CGA CGG   [SEQ ID NO:20]

additional amino acids  Glu Phe Gly Thr Ser
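
The amino acids listed under the Construct (A) and Construct (B) junctions can be checked by
translating the junction codons with the standard genetic code. The sketch below is
illustrative only: the helper name translate_junction is hypothetical, the first Construct (A)
codon is read as GGG (as the listed Gly requires) and the Construct (B) codon printed as
"A GC" is read as AGC.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}


def translate_junction(codons):
    """Translate space-separated codons into three-letter residue names."""
    return " ".join(THREE_LETTER[CODON_TABLE[c.upper()]] for c in codons.split())


# Junction codons as read from the text (see assumptions in the lead-in).
construct_a = translate_junction("GGG ATC CCC CTG GTC")
construct_b = translate_junction("GAA TTC GGC ACG AGC CGA CGG")

if __name__ == "__main__":
    print("Construct (A):", construct_a)
    print("Construct (B):", construct_b)
```

In both cases the leading residues reproduce the "additional amino acids" given in the text
(Gly Ile Pro and Glu Phe Gly Thr Ser), and the remaining codons belong to the mcg7 insert.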

Construct (C): full-length mcg7: The pGEM-T clone containing the 5' end of the mcg7 coding
region was digested with ApaI (subsequently blunt-ended with T4 DNA polymerase) and BstXI
to liberate the fragment between nucleotide positions 336 and 830 of Figure 13(a). Clone
113434 was digested with BstXI and HindIII (vector derived) to liberate a fragment between
nucleotide positions 830 and >2416 (vector derived) of Figure 13(a). A pGEM-11Zf vector
(Promega) containing the myc-tag was digested with ApaI (subsequently blunt-ended with T4
DNA polymerase) and HindIII, and ligated with the 2 inserts described above.

Sequence of the myc-tag/mcg7 junction [SEQ ID NOs:21/22]:

myc-tag                           vector  BamHI       mcg7 5' UTR (337)    start
ATGGAGCAGAAGCTGATCTCCGAGGAGGACCTG CCCGGGGCAGCTggatCcG CAGCCCACCCCGCGCCGGCGGCCATG
M E Q K L I S E E D L             P G A A G S         A A H P A P A A M
                                  additional amino acids

The myc-tagged full-length mcg7 insert in pGEM-11Zf was then excised with SacI and HindIII
(both vector derived) and directionally cloned into the mammalian expression vector pEXV
(Beranger et al, 1994).

Construct (D): Construct (C) in pGEM-11Zf was sequentially digested with HindIII (this site
was subsequently blunt-ended with T4 DNA polymerase) then BamHI, and ligated into pGEX-
2T digested with BamHI and SmaI. Digestion with BamHI removed the myc-tag of Construct (C).

Sequence of the pGEX and mcg7 [SEQ ID NO:23/24] (underlined) junction:
          pGEX-2T BamHI  mcg7 (337)

Sj26 ...  gga tcc        GCA GCC CAC CCC GCG CCG GCG GCC ATG

          Gly Ser        Ala Ala His Pro Ala Pro Ala Ala Met
          additional amino acids



EXAMPLE 16 

Overnight bacterial cultures containing the pGEX plasmid were used to inoculate 500ml of Luria
Broth media containing 50µg/ml ampicillin. The cultures were grown to an OD of ~0.8 and then
induced with 1mM of IPTG for up to 3 hours at 37°C. The bacteria were pelleted and
resuspended in 15 ml of STE buffer (10mM Tris pH 8.0, 150 mM NaCl and 1mM EDTA) with
1 mg/ml lysozyme. The mixture was left on ice for more than 1 hour and subsequent steps were
performed at 4°C. Protease inhibitors aprotinin, pepstatin and leupeptin were added at final
concentrations of 25µg/ml, prior to the addition of Triton-X-100 (2% v/v final) and n-lauroyl
sarcosine (1.5% w/v final). The lysate was sonicated for ~1 minute and pelleted at 14 000 x g
for 15 minutes. 100µl of 50% w/v glutathione-sephadex bead slurry (in PBS) was added per
ml of supernatant. Following a 30 minute incubation at 4°C, the beads were washed three times
with NETN (20mM Tris-HCl pH 8.0, 100mM NaCl, 1mM EDTA, 0.5% NP40), once with
NETN-HS (equivalent to NETN but with 1M NaCl), and once in NETN. The bound protein
was directly analysed by SDS-polyacrylamide gel electrophoresis (PAGE) as described below
or the bound protein was eluted from the beads with the following elution buffer (50mM Tris pH
8.0, 150mM NaCl, 5mM MgCl2, 1mM DTT, 10mM reduced glutathione) for use in GDP release
assays.






EXAMPLE 17 



Twenty microlitres of GST-sepharose-bound MCG7 were added to an equal volume of 2 x
sample loading dye (100mM Tris pH6.8, 2% v/v mercaptoethanol, 4% w/v SDS, 0.2% w/v
bromophenol blue, 20% v/v glycerol), boiled for 5 min and loaded onto a 7.5% w/v SDS-PAGE
gel (Sambrook et al, 1989). The Coomassie brilliant blue stained gel (Sambrook et al, 1989)
typically displayed a protein doublet, running between 87-95 kDa, consisting of the MCG7-GST
fusion and a slightly smaller, co-purified contaminating E. coli protein of ~105kDa. The
calculated molecular weight of full-length MCG7 is 77.5 kDa (Construct (D)) and the GST
component has a molecular weight of 26kDa, hence, the recombinant protein runs slightly
smaller than predicted. A Western blot of the same gel probed with anti-GST antibody yields
an MCG7-specific band at the same position as that of the stained gel.
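
The statement that the fusion runs smaller than predicted follows from adding the two
calculated masses. A minimal sketch of that arithmetic, using only the figures quoted above
(the variable names are illustrative):

```python
# Calculated masses quoted in the text (kDa).
MCG7_KDA = 77.5
GST_KDA = 26.0

# Apparent size range of the doublet on the stained gel (kDa).
OBSERVED_RANGE_KDA = (87.0, 95.0)

# Predicted mass of the GST-MCG7 fusion is the simple sum of the two parts.
predicted_fusion_kda = MCG7_KDA + GST_KDA

# Difference between predicted and observed mobility at each end of the range.
shortfall_kda = tuple(predicted_fusion_kda - obs for obs in OBSERVED_RANGE_KDA)

if __name__ == "__main__":
    print(f"Predicted fusion: {predicted_fusion_kda} kDa")
    print(f"Runs {shortfall_kda[1]}-{shortfall_kda[0]} kDa below prediction")
```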

EXAMPLE 18 

Assumptions: (a) GST-Ras molecular weight = 50 kD; (b) Concentration of GST-Ras solution
= 1mg/ml = 20µM; (c) [3H]-GDP is 1mCi/ml and 13.3Ci/mmol, therefore [3H]-GDP
concentration = 75µM and 1pmol [3H]-GDP = 15,466 cpm; (d) Elution buffer = Buffer E = 20
mM Tris-Cl, pH7.5; 50mM NaCl; 5mM MgCl2; 1mM DTT (added just before use). Buffer E
+ BSA = Buffer E + 1mg/ml BSA (added just before use).

Mix together, in the following order and mix well after each addition:

10µl (=10µg) GST-Ras (@ 1mg/ml in Buffer E), 463µl Buffer E + BSA, [3H]-GDP, 10µl
490 mM EDTA. Incubate @ RT for 10 min. Add 10µl 0.5 M MgCl2 and mix well. Incubate
@ RT for 10 min. Place on ice. During the first incubation the excess EDTA concentration is
5mM, during the second incubation the excess Mg concentration is 5mM. The [3H]-GDP
concentration is 1µM and the final concentration of GST-Ras is 400nM. Thus 20µl of the final
mix will contain 8pmol of GST-Ras protein. Specific activity of GDP is 15,466 cpm/pmol x
(1/1.4) = 11,047 cpm/pmol.
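
The concentrations quoted in this Example can be reproduced with a few lines of arithmetic.
The sketch below is illustrative only; the variable names are hypothetical, and the 500µl
final volume is an assumption inferred from the stated 400nM GST-Ras concentration rather
than a figure given in the text.

```python
def molar_uM(mg_per_ml, mol_weight):
    """Convert mg/ml (numerically equal to g/l) to micromolar."""
    return mg_per_ml / mol_weight * 1e6


# (b) 1 mg/ml GST-Ras at 50 kD.
ras_stock_uM = molar_uM(1.0, 50_000)

# (c) 1 mCi/ml equals 1 Ci/l; dividing by 13.3 Ci/mmol gives mmol/l, x1000 gives uM.
gdp_stock_uM = 1.0 / 13.3 * 1e3

# 10 ul of stock: ul x uM = pmol.
ras_pmol_added = 10 * ras_stock_uM

# ASSUMPTION: final mix volume of 500 ul, inferred from the stated 400 nM.
ASSUMED_FINAL_VOLUME_UL = 500
ras_final_nM = ras_pmol_added / ASSUMED_FINAL_VOLUME_UL * 1e3  # pmol/ul = uM, then nM

# Amount of GST-Ras in a 20 ul aliquot of the final mix (nM x ul / 1000 = pmol).
ras_pmol_per_20ul = ras_final_nM * 20 / 1e3

# Specific activity after the 1/1.4 factor given in the text.
cpm_per_pmol = 15_466 / 1.4

if __name__ == "__main__":
    print(f"GST-Ras stock: {ras_stock_uM:.1f} uM")
    print(f"[3H]-GDP stock: {gdp_stock_uM:.1f} uM")
    print(f"GST-Ras final: {ras_final_nM:.0f} nM")
    print(f"GST-Ras per 20 ul: {ras_pmol_per_20ul:.1f} pmol")
    print(f"Specific activity: {cpm_per_pmol:.0f} cpm/pmol")
```

The last line shows that the 11,047 cpm/pmol figure corresponds to 15,466 cpm/pmol.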

EXAMPLE 19

Exchange Ras with labelled GDP as above. Add unlabelled GTP (stock = 100mM, pH7) to 1
mM. Adjust Mg concentration by adding 5µl 0.5M EDTA to labelled Ras, 5µl 0.5M EDTA to
500µl MCG7, and 5µl 0.5M EDTA to 500µl Buffer E + BSA. On ice set up microfuge tubes
with 40µl Ras-GDP (in triplicate) with 40µl MCG7 or Buffer E + BSA (control). Transfer tubes
to heat block @ 25°C and incubate for 10, 20 or 30 min. Stop exchange reactions with 1ml of
ice cold Buffer E and place on ice. Pre-soak nitrocellulose filters, pore size 0.45µm, in Buffer E.
Assemble the vacuum manifold apparatus (Millipore) with wet filters and plug the wells with
rubber bungs. Switch on the vacuum pump. Remove the first plug, aliquot the sample and once
it has been sucked through, wash the filter with 10ml of ice cold Buffer E. Remove next plug
etc and continue round the manifold. Take manifold apart. Pin the filters to a pin board reserved
for [3H]. Air dry. Take up in 4ml scintillation fluid and count. These studies have been carried
out with a truncated MCG7-GST fusion protein (amino acids 341 of Figure 13a to stop encoded
within construct B).

EXAMPLE 20

A human gene was identified from chromosome 11q13 that encodes a new member of the DnaJ
family of proteins (designated MCG18). This gene (mcg18) is expressed as an ~1.4kb mRNA
(Fig. 28) and is predicted to encode a 241 amino acid product (Fig. 19).

EXAMPLE 21 

MCG18 has partial homology to E. coli dnaJ and other human DnaJ family members in that it 
contains the J domain (Fig. 20). 


EXAMPLE 22 

MCG18 has greatest homology to functionally undefined proteins from C. elegans (Fig. 21) and
S. pombe (Fig. 22) that also feature the J domain but maintain sequence similarity through the
central and C-terminal regions of the proteins.

EXAMPLE 23 

The J domain is proposed to mediate interaction with heat shock protein 70 (Hsp70) and consists
of some 70 amino acids, frequently located at the N-terminus of the protein. One of these
proteins, tumorous imaginal discs (Tid58) from Drosophila virilis (Fig. 23), functions as a
tumour suppressor.

EXAMPLE 24 

A comparison of homology between MCG18 and human DnaJ proteins HDJ-2/HSDJ, HDJ-
1/HSP40 and HSJ1 is shown in Fig. 24.

EXAMPLE 25 

During the sequence characterisation of the VRF/VEGFB promoter region on cosmid CLGW4
[Grimmond et al, 1996], which maps to chromosome 11q13, the inventors identified a sequence
that exactly matched numerous human and mouse expressed sequence tags (ESTs) in the EST
database from a gene which was designated mcg18. EST clones for human (GenBank accession
number T69741, clone 108172; accession number H40901, clone 177008) and mouse mcg18
(accession number W34884, clone 350966; accession number W64183, clone 385535) were
obtained from Genome Systems Inc. and sequenced with the gene-specific primers shown in
Table 7. The EST clones listed in Table 8 were also utilised in generating the full-length coding
sequence for human (Figure 19) and mouse (Figure 25) mcg18. The EST database also
contained mcg18 cDNA entries that were alternately (or partially) spliced, and in order to
understand their ability to encode new polypeptides, the gene structure of mcg18 was determined
by sequencing human and mouse genomic templates with gene-specific primers.

Genomic fragments containing the human [Grimmond et al, 1996] and murine genes [Townson
et al, 1996] have been previously reported. Cosmid CLGW4 contains the entire human gene
and λ121 contains the entire mouse gene, as determined by direct sequencing of the templates
with the oligonucleotides listed in Table 7. Plasmids containing sub-fragments of λ121 and
cosmid CLGW4 were prepared using plasmid purification kits (Qiagen) and sequenced as
described previously [Grimmond et al, 1996; Townson et al, 1996] using primers designed
against cDNA and genomic sequences. The BLAST suite of programs [Altschul et al, 1990]
was used to compare the sequence data against the nucleotide and protein databases at the
National Center for Biotechnology Information (http//www.ncbi.nih.gov.nlm). The sequence
data were compiled using MacVector 4.2.1 software (IBI-Kodak). ClustalW sequence
alignments [Thompson et al, 1994] were conducted using the Australian National Genome
Information Service computer facility at the University of Sydney, Australia.

The cDNA sequence of human mcg18 (Figure 19) was translated in all possible reading frames
and compared to the GenBank non-redundant protein database using the program BLASTX
[Altschul et al, 1990] and the coding region was identified on the basis of showing homology to
the DnaJ family of proteins (Figure 20). The DnaJ domain is encoded within the longest open
reading frame and the assigned initiation codon is preceded by an in-frame stop codon (Figure
27). Similar database search results were obtained for the mouse mcg18 cDNA, and the
alignment of human and mouse protein sequences is shown in Figure 26. MCG18 has greatest
homology to gene products from C. elegans (Figure 21) and S. pombe (Figure 22). Although
it shares a similar J-domain, MCG18 does not contain other domains described for the tumour
suppressor gene from D. virilis (Figure 23), nor is it a homologue of other reported human J-
domain-containing proteins (Figure 24).

To determine the expression pattern of mcg18, 15 µg of total cellular RNA (RNeasy Mini Kit,
Qiagen) from various human cell lines grown in culture were electrophoresed through 1.2%
MOPS/formaldehyde gels and blotted onto nylon membranes (Amersham) by capillary transfer
using 20 x SSC (Sambrook et al 1986). Filters were subsequently UV-fixed and hybridised
overnight at 65°C to a radiolabelled (³²P-dCTP) cDNA probe (Church and Gilbert, 1984) for
mcg18. After washes in 0.1 x SSC/0.1% w/v SDS at 65°C for 1 hour, the filters were air-dried
and exposed to X-ray film. This Northern analysis showed that mcg18 is expressed as a 1.4 kb
message in numerous tissues including breast, ovary, bladder, lung and keratinocytes (Figure 28).

TABLE 4
ESTs matching mcg4

accession number      seq. run organism                        score  E value   N

gb|AA399110|AA399110  zt89e06.s1 Soares testis NHT Homo sa...   1136  4.0e-168  2
gb|N39612|N39612      yy51g06.s1 Homo sapiens cDNA clone 2...   1521  5.3e-168  4
gb|AA514406|AA514406  nf57d01.s1 NCI_CGAP_Co3 Homo sapiens...    931  5.5e-166  3
gb|AA544946|AA544946  vk38e02.r1 Soares mouse mammary glan...   1207  8.4e-164  2
gb|AA450076|AA450076  zx42a04.s1 Soares total fetus Nb2HF8...    691  2.3e-160  4
gb|AA535731|AA535731  nf88f07.s1 NCI_CGAP_Ca3 Homo sapiens...    796  3.5e-158  4
gb|W79710|W79710      zd86f01.r1 Soares fetal heart NbHH19...   1644  1.1e-157  4
gb|AA503531|AA503531  ne47e08.s1 NCI_CGAP_Co3 Homo sapiens...    736  4.0e-156  4
gb|AA450132|AA450132  zx42a04.r1 Soares total fetus Nb2HF8...   1955  3.9e-155  1
gb|AA398068|AA398068  zt89f06.r1 Soares testis NHT Homo sa...   1315  5.4e-148  2
gb|W60405|W60405      zd29h08.r1 Soares fetal heart NbHH19...   1022  1.8e-139  4
gb|W81382|W81382      zd86f01.s1 Soares fetal heart NbHH19...    605  3.5e-125  5
gb|AA047617|AA047617  zf13f07.s1 Soares fetal heart NbHH19...    922  4.6e-125  2
gb|AA282175|AA282175  zt02d03.s1 NCI_CGAP_GCB1 Homo sapien...   1577  2.0e-123  1
gb|AA242159|AA242159  my30d04.r1 Barstead mouse pooled org...    866  7.7e-117  2
gb|AA068680|AA068680  mm61a05.r1 Stratagene mouse embryoni...   1280  1.6e-98   1
gb|W46766|W46766      zc36b07.s1 Soares senescent fibrobla...    506  9.6e-92   3
gb|N93704|N93704      zb51c04.s1 Soares fetal lung NbHL19W...    584  9.0e-91   4
gb|AA155210|AA155210  mr98e01.r1 Stratagene mouse embryoni...    840  7.6e-87   2
gb|AA366022|AA366022  EST76915 Pineal gland II Homo sapien...   1077  2.4e-81   1
gb|AA037691|AA037691  zk34h12.s1 Soares pregnant uterus Nb...    949  2.1e-80   2
gb|W35374|W35374      zc07h03.s1 Soares parathyroid tumor       1016  3.1e-76   1
dbj|C00696|C00696     HUMGS0008251, Human Gene Signature...     1009  1.2e-75   1
gb|T98249|T98249      ye59a07.s1 Homo sapiens cDNA clone 1...    998  6.7e-75   1
gb|W21588|W21588      zb51c04.r1 Soares fetal lung NbHL19W...    484  1.1e-69   4
gb|H32171|H32171      EST107015 Rattus sp. cDNA 5' end.          828  1.1e-60   1
gb|AA108092|AA108092  mn89e06.r1 Stratagene mouse embryoni...    782  1.3e-60   2
gb|AA017857|AA017857  mh44d10.r1 Soares mouse placenta 4Nb...    665  2.5e-60   2
gb|AA037690|AA037690  zk34h12.r1 Soares pregnant uterus Nb...    540  9.4e-53   2
gb|AA531006|AA531006  nj07b11.s1 NCI_CGAP_Pr22 Homo sapien...    535  5.4e-48   2
gb|N46760|N46760      yy51g06.r1 Homo sapiens cDNA clone 2...    665  9.5e-47   1
gb|W23584|W23584      zc71d03.s1 Soares fetal heart NbHH19...    457  1.8e-44   2
gb|W42214|W42214      mc69h09.r1 Soares mouse embryo NbME1...    460  1.3e-38   3
gb|AA244877|AA244877  mx25a04.r1 Soares mouse NML Mus musc...    429  2.9e-25   1
gb|W32939|W32939      zc07h03.r1 Soares parathyroid tumor        320  4.8e-18   1

TABLE 5

ESTs matching AA074703 (mcg4-related cDNA)

Database: Non-redundant Database of GenBank EST Division
1,222,625 sequences; 449,352,662 total letters.

Sequences producing High-scoring Segment Pairs (High Score; Smallest Sum Probability P(N); N):

accession number      seq. run organism                        score  E value   N

gb|AA074703|AA074703  zm76g07.r1 Stratagene neuroepitheli...    2071  4.0e-167  1
gb|AA068680|AA068680  mm61a05.r1 Stratagene mouse embryon...    1270  4.4e-145  4
gb|AA134788|AA134788  zm81g02.r1 Stratagene neuroepitheli...     946  1.3e-144  5
gb|AA399110|AA399110  zt89e06.s1 Soares testis NHT Homo s...     520  8.7e-119  6
gb|N39612|N39612      yy51g06.s1 Homo sapiens cDNA clone ...     582  9.6e-110  7
gb|AA282175|AA282175  zt02d03.s1 NCI_CGAP_GCB1 Homo sapie...     771  9.4e-80   3
gb|W81382|W81382      zd86f01.s1 Soares fetal heart NbHH1...     329  1.6e-75   6
gb|AA544946|AA544946  vk38e02.r1 Soares mouse mammary gla...     644  9.6e-63   2
gb|W35374|W35374      zc07h03.s1 Soares parathyroid tumor...     294  4.5e-42   4
gb|W57106|W57106      md57c12.r1 Soares mouse embryo NbME...     394  1.9e-30   2
gb|AA244877|AA244877  mx25a04.r1 Soares mouse NML Mus mus...     162  2.1e-27   4
gb|AA017857|AA017857  mh44d10.r1 Soares mouse placenta 4N...     230  3.7e-23   3
gb|AA531006|AA531006  nj07b11.s1 NCI_CGAP_Pr22 Homo sapie...     139  2.3e-19   3
gb|H32171|H32171      EST107015 Rattus sp. cDNA 5' end.          207  2.6e-10   2
gb|W79710|W79710      zd86f01.r1 Soares fetal heart NbHH1...     157  0.0073    1

TABLE 6
mcg7-specific oligonucleotides

name           sequence (5' to 3')            SEQ ID NO

M1044R         GGA CAA ACT GTG TGA TGA ACC    SEQ ID NO:25
MCG7-GEF-REV2  CTC ATC CTC CGT GTG ATA CTG    SEQ ID NO:26
M7R            GTA GAT GTG GAT GAG CTT GG     SEQ ID NO:27
MCG7 CA FOR    AGG TOG AGA ATG GTC AAG G      SEQ ID NO:28
MCG7-GEF-REV   GTC ATA GTC TGT CTC CTA CT     SEQ ID NO:29
MCG7 GEF FOR   ACA TAG ACA GCG TGC CTA CC     SEQ ID NO:30
MCG7-PKC-REV   TAC AAC CTT AGG GAC ACC AG     SEQ ID NO:31
MCG7-PKC-FOR   TGC TGA GCC TGC TCA CGG TG     SEQ ID NO:32
T09103F        CAA GTG AAC AGC ACG TCC        SEQ ID NO:33
M7F            GAC TAT CTC AAG GAC CAG CTG    SEQ ID NO:34
MCG7UF         GGT TCG GTC CGA GCC CGG        SEQ ID NO:35
SGCADRV2       GGA GCG ATA CTC CAA GTA GGT    SEQ ID NO:36

TABLE 7

mcg18-SPECIFIC OLIGONUCLEOTIDES

name         sequence (5' to 3')

HVESTF 


AC^ aCiC^ C*CA rwZi^ r*r*C THT/^ i^x:r\ jt\ xi/^.a^n 
^^-"^ vjvjvj OAjrv. 1 IC^ [otv^ A-L' INO: J / J 


HV195F 


PAT PPT PtnT PP A ATn t^nr* Tr* rcl::r^ Tr\ xir^.ooi 


HV387F2      GCA CTG AGG AAG TTA AAC GAG C    [SEQ ID NO:39]
HV408R       GCT CGT TTA ACT TCC TCA GTG C    [SEQ ID NO:40]
EXONIREV     GCT CAG CTC CAC AAA GCG GCT      [SEQ ID NO:41]
HVEST426F    ACC AGC TCC GCT CAG GTA G        [SEQ ID NO:42]
HVEST623R    TCC AGG AGC TGT GTG TTT GG       [SEQ ID NO:43]
SGVESTF3     CCA GTT TCA CAG CGT GAG G        [SEQ ID NO:44]
HVEST631R    CAG CAT GAG GAG GAG GCA G        [SEQ ID NO:45]
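As a small consistency check on the table above, the following Python sketch tests whether HV387F2 and HV408R are exact reverse complements of one another, as their names suggest. It assumes that the third codon of HV387F2 reads AGG and that the numbers in the primer names are cDNA coordinates (387 to 408 for a 22-mer); the helper names are hypothetical and the check is illustrative only.

```python
# Primer sequences as listed in Table 7, with spaces removed.
HV387F2 = "GCACTGAGGAAGTTAAACGAGC"
HV408R = "GCTCGTTTAACTTCCTCAGTGC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_reverse_complement_pair(forward, reverse):
    """True when the reverse primer is the exact reverse complement of the forward primer."""
    return reverse_complement(forward) == reverse.upper()


print(is_reverse_complement_pair(HV387F2, HV408R))
```

A result of True indicates that the two oligonucleotides anneal to opposite strands of the same 22-nucleotide stretch, which is the usual arrangement for sequencing a template in both directions from one site.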
TABLE 8

EST CLONE SEQUENCES USED TO GENERATE HUMAN AND MOUSE
mcg18 cDNA SEQUENCE COMPOSITES

EST clone number    species    GenBank accession number

lg28I5              human
0OI-T2-18           human
273748              human
177008              human
258011              human      IN JU/ /o
276887              human      N44004
108172              human      T69741
307529              human      W21083 and W32579
342027              human      W60283
354288              mouse      W44038
350966              mouse      W348844
426261              mouse      AA002868
368185              mouse      W53911
385535              mouse      W64183
404472              mouse      W82959
406437              mouse      W83482

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: (OTHER THAN US): The Council of The Queensland Institute of
Medical Research

(US ONLY): HAYWARD Nicholas, SILINS Ginters, GRIMMOND Sean,
GARTSIDE Michael and HANCOCK John

(ii) TITLE OF INVENTION: A NOVEL GENE AND USES THEREFOR

(iii) NUMBER OF SEQUENCES: 45

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: DAVIES COLLISON CAVE

(B) STREET: 1 LITTLE COLLINS STREET

(C) CITY: MELBOURNE

(D) STATE: VICTORIA

(E) COUNTRY: AUSTRALIA

(F) ZIP: 3000

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT INTERNATIONAL 

(B) FILING DATE: 22-MAY-1998 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: PO6973

(B) FILING DATE: 23-MAY-1997

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: PO6974

(B) FILING DATE: 23-MAY-1997

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: PO6972

(B) FILING DATE: 23-MAY-1997

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: PP1459

(B) FILING DATE: 22-JAN-1998

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: PP1460

(B) FILING DATE: 22-JAN-1998

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: PP1458

(B) FILING DATE: 22-JAN-1998

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION: 
(A) NAME: HUGHES, DR E JOHN L 
(C) REFERENCE/DOCKET NUMBER: EJH/AF 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: +61 3 9254 2777 

(B) TELEFAX: +61 3 9254 2770 

(C) TELEX: AA 31787

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1 : 

Cys Xaa Xaa Cys Xaa Gly Xaa Gly 

5 



(2) INFORMATION FOR SEQ ID NO : 2 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1242 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 30..959



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 : 

TCAGTAAACA CAGAGACTGG GGATCGATC ATG GGG CTT TGT AAG TGC CCC AAG 53 

Met Gly Leu Cys Lys Cys Pro Lys 
1 5 

AGA AAG GTG ACC AAC CTG TTC TGC TTC GAA CAT CGG GTC AAC GTC TGC 101 
Arg Lys Val Thr Asn Leu Phe Cys Phe Glu His Arg Val Asn Val Cys 
10 15 20 

GAG CAC TGC CTG GTA GCC AAT CAC GCC AAG TGC ATC GTC CAG TCC TAC 149
Glu His Cys Leu Val Ala Asn His Ala Lys Cys Ile Val Gln Ser Tyr
25 30 35 40

CTG CAA TGG CTC CAA GAT AGC GAC TAC AAC CCC AAT TGC CGC CTG TGC 197
Leu Gln Trp Leu Gln Asp Ser Asp Tyr Asn Pro Asn Cys Arg Leu Cys
45 50 55

AAC ATA CCC CTG GCC AGC CGA GAG ACG ACC CGC CTT GTC TGC TAT GAT 245
Asn Ile Pro Leu Ala Ser Arg Glu Thr Thr Arg Leu Val Cys Tyr Asp
60 65 70

CTC TTT CAC TGG GCC TGC CTC AAT GAA CGT GCT GCC CAG CTA CCC CGA 293
Leu Phe His Trp Ala Cys Leu Asn Glu Arg Ala Ala Gln Leu Pro Arg
75 80 85

AAC ACG GCA CCT GCC GGC TAT CAG TGC CCC AGC TGC AAT GGC CCC ATC 341
Asn Thr Ala Pro Ala Gly Tyr Gln Cys Pro Ser Cys Asn Gly Pro Ile
90 95 100

TTC CCC CCA ACC AAC CTG GCT GGC CCC GTG GCC TCC GCA CTG AGA GAG 389
Phe Pro Pro Thr Asn Leu Ala Gly Pro Val Ala Ser Ala Leu Arg Glu
105 110 115 120

AAG CTG GCC ACA GTC AAC TGG GCC CGG GCA GGA CTG GGC CTC CCT CTG 437
Lys Leu Ala Thr Val Asn Trp Ala Arg Ala Gly Leu Gly Leu Pro Leu
125 130 135

ATC GAT GAG GTG GTG AGC CCA GAG CCC GAG CCC CTC AAC ACG TCT GAC 485
Ile Asp Glu Val Val Ser Pro Glu Pro Glu Pro Leu Asn Thr Ser Asp
140 145 150

TTC TCT GAC TGG TCT AGT TTT AAT GCC AGC AGT ACC CCT GGA CCA GAG 533
Phe Ser Asp Trp Ser Ser Phe Asn Ala Ser Ser Thr Pro Gly Pro Glu
155 160 165

GAG GTA GAC AGC GCC TCT GCT GCC CCA GCC TTC TAC AGC CGA GCC CCC 581
Glu Val Asp Ser Ala Ser Ala Ala Pro Ala Phe Tyr Ser Arg Ala Pro
170 175 180

CGG CCC CCA GCT TCC CCA GGC CGG CCC GAG CAG CAC ACA GTG ATC CAC 629
Arg Pro Pro Ala Ser Pro Gly Arg Pro Glu Gln His Thr Val Ile His
185 190 195 200

ATG GGC AAT CCT GAG CCC TTG ACT CAC GCC CCT AGG AAG GTG TAT GAT 677
Met Gly Asn Pro Glu Pro Leu Thr His Ala Pro Arg Lys Val Tyr Asp
205 210 215

ACG CGG GAT GAT GAC CGG ACA CCA GGC CTC CAT GGA GAC TGT GAC GAT 725
Thr Arg Asp Asp Asp Arg Thr Pro Gly Leu His Gly Asp Cys Asp Asp
220 225 230

GAC AAG TAC CGA CGT CGG CCG GCC TTG GGT TGG CTG GCC CGG CTG CTA 773
Asp Lys Tyr Arg Arg Arg Pro Ala Leu Gly Trp Leu Ala Arg Leu Leu
235 240 245

AGG AGC CGG GCT GGG TCT CGG AAG CGG CCG CTG ACC CTG CTC CAG CGG 821
Arg Ser Arg Ala Gly Ser Arg Lys Arg Pro Leu Thr Leu Leu Gln Arg
250 255 260

GCG GGG CTG CTG CTA CTC TTG GGA CTG CTG GGC TTC CTG GCC CTC CTT 869
Ala Gly Leu Leu Leu Leu Leu Gly Leu Leu Gly Phe Leu Ala Leu Leu
265 270 275 280

GCC CTC ATG TCT CGC CTA GGC CGG GCC GCA GCT GAC AGC GAT CCC AAC 917
Ala Leu Met Ser Arg Leu Gly Arg Ala Ala Ala Asp Ser Asp Pro Asn
285 290 295

CTG GAC CCA CTC ATG AAC CCT CAC ATC CGC GTG GGC CCC TCC TGA 962
Leu Asp Pro Leu Met Asn Pro His Ile Arg Val Gly Pro Ser *
300 305 310



GCCCCCTTGC TTGTGGCTAG GCCAGCCTAG GATGTGGGTT CTGTGGAGGA GAGGCGGGGT 1022

AATGGGGAGG CTGAGGGCAC CTCTTCACTG CCCCTCTCCC TCAAGCCTAA GACACTAAGA 1082

CCCCAGACCC AAAGCCAAGT CCACCAGAGT GGCTCGCAGG CCAGGCCTGG AGTCCCCGTG 1142

GGTCAAGCAT TTGTCTTGAC TTGCTTTCTC CCGGGTCTCC AGCCTCCGAC CCCTCGCCCC 1202

ATGAAGGAGC TGGCAGGTGG AAATAAACAA CAACTTTATT 1242



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 310 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Gly Leu Cys Lys Cys Pro Lys Arg Lys Val Thr Asn Leu Phe Cys
1 5 10 15

Phe Glu His Arg Val Asn Val Cys Glu His Cys Leu Val Ala Asn His
20 25 30

Ala Lys Cys Ile Val Gln Ser Tyr Leu Gln Trp Leu Gln Asp Ser Asp
35 40 45

Tyr Asn Pro Asn Cys Arg Leu Cys Asn Ile Pro Leu Ala Ser Arg Glu
50 55 60

Thr Thr Arg Leu Val Cys Tyr Asp Leu Phe His Trp Ala Cys Leu Asn
65 70 75 80

Glu Arg Ala Ala Gln Leu Pro Arg Asn Thr Ala Pro Ala Gly Tyr Gln
85 90 95

Cys Pro Ser Cys Asn Gly Pro Ile Phe Pro Pro Thr Asn Leu Ala Gly
100 105 110

Pro Val Ala Ser Ala Leu Arg Glu Lys Leu Ala Thr Val Asn Trp Ala
115 120 125

Arg Ala Gly Leu Gly Leu Pro Leu Ile Asp Glu Val Val Ser Pro Glu
130 135 140

Pro Glu Pro Leu Asn Thr Ser Asp Phe Ser Asp Trp Ser Ser Phe Asn
145 150 155 160

Ala Ser Ser Thr Pro Gly Pro Glu Glu Val Asp Ser Ala Ser Ala Ala
165 170 175

Pro Ala Phe Tyr Ser Arg Ala Pro Arg Pro Pro Ala Ser Pro Gly Arg
180 185 190

Pro Glu Gln His Thr Val Ile His Met Gly Asn Pro Glu Pro Leu Thr
195 200 205

His Ala Pro Arg Lys Val Tyr Asp Thr Arg Asp Asp Asp Arg Thr Pro
210 215 220

Gly Leu His Gly Asp Cys Asp Asp Asp Lys Tyr Arg Arg Arg Pro Ala
225 230 235 240

Leu Gly Trp Leu Ala Arg Leu Leu Arg Ser Arg Ala Gly Ser Arg Lys
245 250 255

Arg Pro Leu Thr Leu Leu Gln Arg Ala Gly Leu Leu Leu Leu Leu Gly
260 265 270

Leu Leu Gly Phe Leu Ala Leu Leu Ala Leu Met Ser Arg Leu Gly Arg
275 280 285

Ala Ala Ala Asp Ser Asp Pro Asn Leu Asp Pro Leu Met Asn Pro His
290 295 300

Ile Arg Val Gly Pro Ser
305 310



(2) INFORMATION FOR SEQ ID NO : 4 : 

(i) SEQUENCE CHARACTERISTICS: 



WO 98/53061 PCT/AU98/00380



(A) LENGTH: 2415 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 3..2188

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CG ATT TCA TTC CTC GCT CCC CAC AGG TCC CTC TCC CCA AAA TAT TCC 47
Ile Ser Phe Leu Ala Pro His Arg Ser Leu Ser Pro Lys Tyr Ser
1 5 10 15

CAT CTT GTC CTA GCC CAT CCC CCA GAC TAT CTC AAG GAC CAG CTG TCC 95
His Leu Val Leu Ala His Pro Pro Asp Tyr Leu Lys Asp Gln Leu Ser
20 25 30

CCA CGC CCC CGA CCT CCA CTA GGC CTG TGC CAC CCG CTG CCT GCA GGA 143
Pro Arg Pro Arg Pro Pro Leu Gly Leu Cys His Pro Leu Pro Ala Gly
35 40 45

AGA CGC CCG GTC CCG GGC CGG GTT AGC CCC ATG GGA ACG CAG CGC CTG 191
Arg Arg Pro Val Pro Gly Arg Val Ser Pro Met Gly Thr Gln Arg Leu
50 55 60

TGT GGC CGC GGG ACT CAA GGC TGG CCT GGC TCA AGT GAA CAG CAC GTC 239
Cys Gly Arg Gly Thr Gln Gly Trp Pro Gly Ser Ser Glu Gln His Val
65 70 75

CAG GAG GCG ACC TCG TCC GCG GGT TTG CAT TCT GGG GTG GAC GAG CTG 287
Gln Glu Ala Thr Ser Ser Ala Gly Leu His Ser Gly Val Asp Glu Leu
80 85 90 95

GGG GTT CGG TCC GAG CCC GGT GGG AGG CTC CCG GAG CGC AGC CTG GGC 335
Gly Val Arg Ser Glu Pro Gly Gly Arg Leu Pro Glu Arg Ser Leu Gly
100 105 110

CCA GCC CAC CCC GCG CCG GCG GCC ATG GCA GGC ACC CTG GAC CTG GAC 383
Pro Ala His Pro Ala Pro Ala Ala Met Ala Gly Thr Leu Asp Leu Asp
115 120 125

AAG GGC TGC ACG GTG GAG GAG CTG CTC CGC GGG TGC ATC GAA GCC TTC 431
Lys Gly Cys Thr Val Glu Glu Leu Leu Arg Gly Cys Ile Glu Ala Phe
130 135 140

GAT GAC TCC GGG AAG GTG CGG GAC CCG CAG CTG GTG CGC ATG TTC CTC 479
Asp Asp Ser Gly Lys Val Arg Asp Pro Gln Leu Val Arg Met Phe Leu
145 150 155

ATG ATG CAC CCC TGG TAC ATC CCC TCC TCT CAG CTG GCG GCC AAG CTG 527
Met Met His Pro Trp Tyr Ile Pro Ser Ser Gln Leu Ala Ala Lys Leu
160 165 170 175

CTC CAC ATC TAC CAA CAA TCC CGG AAG GAC AAC TCC AAT TCC CTG CAG 575
Leu His Ile Tyr Gln Gln Ser Arg Lys Asp Asn Ser Asn Ser Leu Gln
180 185 190

GTG AAA ACG TGC CAC CTG GTC AGG TAC TGG ATC TCC GCC TTC CCA GCG 623
Val Lys Thr Cys His Leu Val Arg Tyr Trp Ile Ser Ala Phe Pro Ala
195 200 205

GAG TTT GAC TTG AAC CCG GAG TTG GCT GAG CAG ATC AAG GAG CTG AAG 671
Glu Phe Asp Leu Asn Pro Glu Leu Ala Glu Gln Ile Lys Glu Leu Lys
210 215 220

GCT CTG CTA GAC CAA GAA GGG AAC CGA CGG CAC AGC AGC CTA ATC GAC 719
Ala Leu Leu Asp Gln Glu Gly Asn Arg Arg His Ser Ser Leu Ile Asp
225 230 235

ATA GAC AGC GTC CCT ACC TAC AAG TGG AAG CGG CAG GTG ACT CAG CGG 767
Ile Asp Ser Val Pro Thr Tyr Lys Trp Lys Arg Gln Val Thr Gln Arg
240 245 250 255

AAC CCT GTG GGA CAG AAA AAG CGC AAG ATG TCC CTG TTG TTT GAC CAC 815
Asn Pro Val Gly Gln Lys Lys Arg Lys Met Ser Leu Leu Phe Asp His
260 265 270

CTG GAG CCC ATG GAG CTG GCG GAG CAT CTC ACC TAC TTG GAG TAT CGC 863
Leu Glu Pro Met Glu Leu Ala Glu His Leu Thr Tyr Leu Glu Tyr Arg
275 280 285

TCC TTC TGC AAG ATC CTG TTT CAG GAC TAT CAC AGT TTC GTG ACT CAT 911
Ser Phe Cys Lys Ile Leu Phe Gln Asp Tyr His Ser Phe Val Thr His
290 295 300

GGC TGC ACT GTG GAC AAC CCC GTC CTG GAG CGG TTC ATC TCC CTC TTC 959
Gly Cys Thr Val Asp Asn Pro Val Leu Glu Arg Phe Ile Ser Leu Phe
305 310 315

AAC AGC GTC TCA CAG TGG GTG CAG CTC ATG ATC CTC AGC AAA CCC ACA 1007
Asn Ser Val Ser Gln Trp Val Gln Leu Met Ile Leu Ser Lys Pro Thr
320 325 330 335

GCC CCG CAG CGG GCC CTG GTC ATC ACA CAC TTT GTC CAC GTG GCG GAG 1055
Ala Pro Gln Arg Ala Leu Val Ile Thr His Phe Val His Val Ala Glu
340 345 350

AAG CTG CTA CAG CTG CAG AAC TTC AAC ACG CTG ATG GCA GTG GTC GGG 1103
Lys Leu Leu Gln Leu Gln Asn Phe Asn Thr Leu Met Ala Val Val Gly
355 360 365

GGC CTG AGC CAC AGC TCC ATC TCC CGC CTC AAG GAG ACC CAC AGC CAC 1151
Gly Leu Ser His Ser Ser Ile Ser Arg Leu Lys Glu Thr His Ser His
370 375 380

GTT AGC CCT GAG ACC ATC AAG CTC TGG GAG GGT CTC ACG GAA CTA GTG 1199
Val Ser Pro Glu Thr Ile Lys Leu Trp Glu Gly Leu Thr Glu Leu Val
385 390 395

ACG GCG ACA GGC AAC TAT GGC AAC TAC CGG CGT CGG CTG GCA GCC TGT 1247
Thr Ala Thr Gly Asn Tyr Gly Asn Tyr Arg Arg Arg Leu Ala Ala Cys
400 405 410 415

GTG GGC TTC CGC TTC CCG ATC CTG GGT GTG CAC CTC AAG GAC CTG GTG 1295
Val Gly Phe Arg Phe Pro Ile Leu Gly Val His Leu Lys Asp Leu Val
420 425 430

GCC CTG CAG CTG GCA CTG CCT GAC TGG CTG GAC CCA GCC CGG ACC CGG 1343
Ala Leu Gln Leu Ala Leu Pro Asp Trp Leu Asp Pro Ala Arg Thr Arg
435 440 445

CTC AAC GGG GCC AAG ATG AAG CAG CTC TTT AGC ATC CTG GAG GAG CTG 1391
Leu Asn Gly Ala Lys Met Lys Gln Leu Phe Ser Ile Leu Glu Glu Leu
450 455 460

GCC ATG GTG ACC AGC CTG CGG CCA CCA GTA CAG GCC AAC CCC GAC CTG 1439
Ala Met Val Thr Ser Leu Arg Pro Pro Val Gln Ala Asn Pro Asp Leu
465 470 475

CTG AGC CTG CTC ACG GTG TCT CTG GAT CAG TAT CAG ACG GAG GAT GAG 1487
Leu Ser Leu Leu Thr Val Ser Leu Asp Gln Tyr Gln Thr Glu Asp Glu
480 485 490 495

CTG TAC CAG CTG TCC CTG CAG CGG GAG CCG CGC TCC AAG TCC TCG CCA 1535
Leu Tyr Gln Leu Ser Leu Gln Arg Glu Pro Arg Ser Lys Ser Ser Pro
500 505 510

ACC AGC CCC ACG AGT TGC ACC CCA CCA CCC CGG CCC CCG GTA CTG GAG 1583
Thr Ser Pro Thr Ser Cys Thr Pro Pro Pro Arg Pro Pro Val Leu Glu
515 520 525

GAG TGG ACC TCG GCT GCC AAA CCC AAG CTG GAT CAG GCC CTC GTG GTG 1631
Glu Trp Thr Ser Ala Ala Lys Pro Lys Leu Asp Gln Ala Leu Val Val
530 535 540

GAG CAC ATC GAG AAG ATG GTG GAG TCT GTG TTC CGG AAC TTT GAC GTC 1679
Glu His Ile Glu Lys Met Val Glu Ser Val Phe Arg Asn Phe Asp Val
545 550 555

GAT GGG GAT GGC CAC ATC TCA CAG GAA GAA TTC CAG ATC ATC CGT GGG 1727
Asp Gly Asp Gly His Ile Ser Gln Glu Glu Phe Gln Ile Ile Arg Gly
560 565 570 575

AAC TTC CCT TAC CTC AGC GCC TTT GGG GAC CTC GAC CAG AAC CAG GAT 1775
Asn Phe Pro Tyr Leu Ser Ala Phe Gly Asp Leu Asp Gln Asn Gln Asp
580 585 590

GGC TGC ATC AGC AGG GAG GAG ATG GTT TCC TAT TTC CTG CGC TCC AGC 1823
Gly Cys Ile Ser Arg Glu Glu Met Val Ser Tyr Phe Leu Arg Ser Ser
595 600 605

TCT GTG TTG GGG GGG CGC ATG GGC TTC GTA CAC AAC TTC CAG GAG AGC 1871
Ser Val Leu Gly Gly Arg Met Gly Phe Val His Asn Phe Gln Glu Ser
610 615 620

AAC TCC TTG CGC CCC GTC GCC TGC CGC CAC TGC AAA GCC CTG ATC CTG 1919
Asn Ser Leu Arg Pro Val Ala Cys Arg His Cys Lys Ala Leu Ile Leu
625 630 635

GGC ATC TAC AAG CAG GGC CTC AAA TGC CGA GCC TGT GGA GTG AAC TGC 1967
Gly Ile Tyr Lys Gln Gly Leu Lys Cys Arg Ala Cys Gly Val Asn Cys
640 645 650 655

CAC AAG CAG TGC AAG GAT CGC CTG TCA GTT GAG TGT CGG CGC AGG GCC 2015
His Lys Gln Cys Lys Asp Arg Leu Ser Val Glu Cys Arg Arg Arg Ala
660 665 670

CAG AGT GTG AGC CTG GAG GGG TCT GCA CCC TCA CCC TCA CCC ATG CAC 2063
Gln Ser Val Ser Leu Glu Gly Ser Ala Pro Ser Pro Ser Pro Met His
675 680 685

AGC CAC CAT CAC CGC GCC TTC AGC TTC TCT CTG CCC CGC CCT GGC AGG 2111
Ser His His His Arg Ala Phe Ser Phe Ser Leu Pro Arg Pro Gly Arg
690 695 700

CGA GGC TCC AGG CCT CCA GAG ATC CGT GAG GAG GAG GTA CAG ACG GTG 2159
Arg Gly Ser Arg Pro Pro Glu Ile Arg Glu Glu Glu Val Gln Thr Val
705 710 715

GAG GAT GGG GTG TTT GAC ATC CAC TTG TA ATAGATGCTG TGGTTGGATC 2208
Glu Asp Gly Val Phe Asp Ile His Leu
720 725

AAGGACTCAT TCCTGCCTTG GAGAAAATAC TTCAACCAGA GCAGGGAGCC TGGGGGTGTC 2268

GGGGCAGGAG GCTGGGGATG GGGGTGGGAT ATGAGGGTGG CATGCAGCTG AGGGCAGGGC 2328

CAGGGCTGGT GTCCCTAAGG TTGTACAGAC TCTTGTGAAT ATTTGTATTT TCCAGATGGA 2388

ATAAAAAGGC CCGTGTAATT AACCTTC 2415

(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 728 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 : 

Ile Ser Phe Leu Ala Pro His Arg Ser Leu Ser Pro Lys Tyr Ser His
1 5 10 15

Leu Val Leu Ala His Pro Pro Asp Tyr Leu Lys Asp Gln Leu Ser Pro
20 25 30

Arg Pro Arg Pro Pro Leu Gly Leu Cys His Pro Leu Pro Ala Gly Arg
35 40 45

Arg Pro Val Pro Gly Arg Val Ser Pro Met Gly Thr Gln Arg Leu Cys
50 55 60

Gly Arg Gly Thr Gln Gly Trp Pro Gly Ser Ser Glu Gln His Val Gln
65 70 75 80

Glu Ala Thr Ser Ser Ala Gly Leu His Ser Gly Val Asp Glu Leu Gly
85 90 95

Val Arg Ser Glu Pro Gly Gly Arg Leu Pro Glu Arg Ser Leu Gly Pro
100 105 110

Ala His Pro Ala Pro Ala Ala Met Ala Gly Thr Leu Asp Leu Asp Lys
115 120 125

Gly Cys Thr Val Glu Glu Leu Leu Arg Gly Cys Ile Glu Ala Phe Asp
130 135 140

Asp Ser Gly Lys Val Arg Asp Pro Gln Leu Val Arg Met Phe Leu Met
145 150 155 160

Met His Pro Trp Tyr Ile Pro Ser Ser Gln Leu Ala Ala Lys Leu Leu
165 170 175

His Ile Tyr Gln Gln Ser Arg Lys Asp Asn Ser Asn Ser Leu Gln Val
180 185 190

Lys Thr Cys His Leu Val Arg Tyr Trp Ile Ser Ala Phe Pro Ala Glu
195 200 205

Phe Asp Leu Asn Pro Glu Leu Ala Glu Gln Ile Lys Glu Leu Lys Ala
210 215 220

Leu Leu Asp Gln Glu Gly Asn Arg Arg His Ser Ser Leu Ile Asp Ile
225 230 235 240

Asp Ser Val Pro Thr Tyr Lys Trp Lys Arg Gln Val Thr Gln Arg Asn
245 250 255

Pro Val Gly Gln Lys Lys Arg Lys Met Ser Leu Leu Phe Asp His Leu
260 265 270

Glu Pro Met Glu Leu Ala Glu His Leu Thr Tyr Leu Glu Tyr Arg Ser
275 280 285

Phe Cys Lys Ile Leu Phe Gln Asp Tyr His Ser Phe Val Thr His Gly
290 295 300

Cys Thr Val Asp Asn Pro Val Leu Glu Arg Phe Ile Ser Leu Phe Asn
305 310 315 320

Ser Val Ser Gln Trp Val Gln Leu Met Ile Leu Ser Lys Pro Thr Ala
325 330 335

Pro Gln Arg Ala Leu Val Ile Thr His Phe Val His Val Ala Glu Lys
340 345 350

Leu Leu Gln Leu Gln Asn Phe Asn Thr Leu Met Ala Val Val Gly Gly
355 360 365

Leu Ser His Ser Ser Ile Ser Arg Leu Lys Glu Thr His Ser His Val
370 375 380

Ser Pro Glu Thr Ile Lys Leu Trp Glu Gly Leu Thr Glu Leu Val Thr
385 390 395 400

Ala Thr Gly Asn Tyr Gly Asn Tyr Arg Arg Arg Leu Ala Ala Cys Val
405 410 415

Gly Phe Arg Phe Pro Ile Leu Gly Val His Leu Lys Asp Leu Val Ala
420 425 430

Leu Gln Leu Ala Leu Pro Asp Trp Leu Asp Pro Ala Arg Thr Arg Leu
435 440 445

Asn Gly Ala Lys Met Lys Gln Leu Phe Ser Ile Leu Glu Glu Leu Ala
450 455 460

Met Val Thr Ser Leu Arg Pro Pro Val Gln Ala Asn Pro Asp Leu Leu
465 470 475 480

Ser Leu Leu Thr Val Ser Leu Asp Gln Tyr Gln Thr Glu Asp Glu Leu
485 490 495

Tyr Gln Leu Ser Leu Gln Arg Glu Pro Arg Ser Lys Ser Ser Pro Thr
500 505 510

Ser Pro Thr Ser Cys Thr Pro Pro Pro Arg Pro Pro Val Leu Glu Glu
515 520 525

Trp Thr Ser Ala Ala Lys Pro Lys Leu Asp Gln Ala Leu Val Val Glu
530 535 540

His Ile Glu Lys Met Val Glu Ser Val Phe Arg Asn Phe Asp Val Asp
545 550 555 560

Gly Asp Gly His Ile Ser Gln Glu Glu Phe Gln Ile Ile Arg Gly Asn
565 570 575

Phe Pro Tyr Leu Ser Ala Phe Gly Asp Leu Asp Gln Asn Gln Asp Gly
580 585 590

Cys Ile Ser Arg Glu Glu Met Val Ser Tyr Phe Leu Arg Ser Ser Ser
595 600 605

Val Leu Gly Gly Arg Met Gly Phe Val His Asn Phe Gln Glu Ser Asn
610 615 620

Ser Leu Arg Pro Val Ala Cys Arg His Cys Lys Ala Leu Ile Leu Gly
625 630 635 640

Ile Tyr Lys Gln Gly Leu Lys Cys Arg Ala Cys Gly Val Asn Cys His
645 650 655

Lys Gln Cys Lys Asp Arg Leu Ser Val Glu Cys Arg Arg Arg Ala Gln
660 665 670

Ser Val Ser Leu Glu Gly Ser Ala Pro Ser Pro Ser Pro Met His Ser
675 680 685

His His His Arg Ala Phe Ser Phe Ser Leu Pro Arg Pro Gly Arg Arg
690 695 700

Gly Ser Arg Pro Pro Glu Ile Arg Glu Glu Glu Val Gln Thr Val Glu
705 710 715 720

Asp Gly Val Phe Asp Ile His Leu
725



(2) INFORMATION FOR SEQ ID NO : 6 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2309 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 254..2083



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

CGATTTCATT CCTCGCTCCC CACAGGTCCC TCTCCCCAAA ATATTCCCAT CTTGTCCTAG 60

CCCATCCCCC AGACTATCTC AAGGACCAGC TGTCCCCACG CCCCCGACCT CCACTAGGCC 120

TGTGCCACCC GCTGCCTGCA GGAAGACGCC CGGTCCCGGG CCGGGTTAGC CCCATGGGAA 180

CGGGGTTCGG TCCGAGCCCG GTGGGAGGCT CCCGGAGCGC AGCCTGGGCC CAGCCCACCC 240

CGCGCCGGCG GCC ATG GCA GGC ACC CTG GAC CTG GAC AAG GGC TGC ACG 289
Met Ala Gly Thr Leu Asp Leu Asp Lys Gly Cys Thr
1 5 10

GTG GAG GAG CTG CTC CGC GGG TGC ATC GAA GCC TTC GAT GAC TCC GGG 337
Val Glu Glu Leu Leu Arg Gly Cys Ile Glu Ala Phe Asp Asp Ser Gly
15 20 25

AAG GTG CGG GAC CCG CAG CTG GTG CGC ATG TTC CTC ATG ATG CAC CCC 385
Lys Val Arg Asp Pro Gln Leu Val Arg Met Phe Leu Met Met His Pro
30 35 40

TGG TAC ATC CCC TCC TCT CAG CTG GCG GCC AAG CTG CTC CAC ATC TAC 433
Trp Tyr Ile Pro Ser Ser Gln Leu Ala Ala Lys Leu Leu His Ile Tyr
45 50 55 60

CAA CAA TCC CGG AAG GAC AAC TCC AAT TCC CTG CAG GTG AAA ACG TGC 481
Gln Gln Ser Arg Lys Asp Asn Ser Asn Ser Leu Gln Val Lys Thr Cys
65 70 75

CAC CTG GTC AGG TAC TGG ATC TCC GCC TTC CCA GCG GAG TTT GAC TTG 529
His Leu Val Arg Tyr Trp Ile Ser Ala Phe Pro Ala Glu Phe Asp Leu
80 85 90

AAC CCG GAG TTG GCT GAG CAG ATC AAG GAG CTG AAG GCT CTG CTA GAC 577
Asn Pro Glu Leu Ala Glu Gln Ile Lys Glu Leu Lys Ala Leu Leu Asp
95 100 105

CAA GAA GGG AAC CGA CGG CAC AGC AGC CTA ATC GAC ATA GAC AGC GTC 625
Gln Glu Gly Asn Arg Arg His Ser Ser Leu Ile Asp Ile Asp Ser Val
110 115 120

CCT ACC TAC AAG TGG AAG CGG CAG GTG ACT CAG CGG AAC CCT GTG GGA 673
Pro Thr Tyr Lys Trp Lys Arg Gln Val Thr Gln Arg Asn Pro Val Gly
125 130 135 140

CAG AAA AAG CGC AAG ATG TCC CTG TTG TTT GAC CAC CTG GAG CCC ATG 721
Gln Lys Lys Arg Lys Met Ser Leu Leu Phe Asp His Leu Glu Pro Met
145 150 155

GAG CTG GCG GAG CAT CTC ACC TAC TTG GAG TAT CGC TCC TTC TGC AAG 769
Glu Leu Ala Glu His Leu Thr Tyr Leu Glu Tyr Arg Ser Phe Cys Lys
160 165 170

ATC CTG TTT CAG GAC TAT CAC AGT TTC GTG ACT CAT GGC TGC ACT GTG 817
Ile Leu Phe Gln Asp Tyr His Ser Phe Val Thr His Gly Cys Thr Val
175 180 185

GAC AAC CCC GTC CTG GAG CGG TTC ATC TCC CTC TTC AAC AGC GTC TCA 865
Asp Asn Pro Val Leu Glu Arg Phe Ile Ser Leu Phe Asn Ser Val Ser
190 195 200

CAG TGG GTG CAG CTC ATG ATC CTC AGC AAA CCC ACA GCC CCG CAG CGG 913
Gln Trp Val Gln Leu Met Ile Leu Ser Lys Pro Thr Ala Pro Gln Arg
205 210 215 220

GCC CTG GTC ATC ACA CAC TTT GTC CAC GTG GCG GAG AAG CTG CTA CAG 961
Ala Leu Val Ile Thr His Phe Val His Val Ala Glu Lys Leu Leu Gln
225 230 235

CTG CAG AAC TTC AAC ACG CTG ATG GCA GTG GTC GGG GGC CTG AGC CAC 1009
Leu Gln Asn Phe Asn Thr Leu Met Ala Val Val Gly Gly Leu Ser His
240 245 250

AGC TCC ATC TCC CGC CTC AAG GAG ACC CAC AGC CAC GTT AGC CCT GAG 1057
Ser Ser Ile Ser Arg Leu Lys Glu Thr His Ser His Val Ser Pro Glu
255 260 265

ACC ATC AAG CTC TGG GAG GGT CTC ACG GAA CTA GTG ACG GCG ACA GGC 1105
Thr Ile Lys Leu Trp Glu Gly Leu Thr Glu Leu Val Thr Ala Thr Gly
270 275 280

AAC TAT GGC AAC TAC CGG CGT CGG CTG GCA GCC TGT GTG GGC TTC CGC 1153
Asn Tyr Gly Asn Tyr Arg Arg Arg Leu Ala Ala Cys Val Gly Phe Arg
285 290 295 300

TTC CCG ATC CTG GGT GTG CAC CTC AAG GAC CTG GTG GCC CTG CAG CTG 1201
Phe Pro Ile Leu Gly Val His Leu Lys Asp Leu Val Ala Leu Gln Leu
305 310 315

GCA CTG CCT GAC TGG CTG GAC CCA GCC CGG ACC CGG CTC AAC GGG GCC 1249
Ala Leu Pro Asp Trp Leu Asp Pro Ala Arg Thr Arg Leu Asn Gly Ala
320 325 330

AAG ATG AAG CAG CTC TTT AGC ATC CTG GAG GAG CTG GCC ATG GTG ACC 1297 
Lys Met Lys Gin Leu Phe Ser He Leu Glu Glu Leu Ala Met Val Thr 
335 340 345 
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AGC CTG CGG CCA CCA GTA CAG GCC AAC CCC GAC CTG CTG AGC CTG CTC 13 45 

Ser Leu Arg Pro Pro Val Gin Ala Asn Pro Asp Leu Leu Ser Leu Leu 
350 355 360 

ACG GTG TCT CTG GAT CAG TAT CAG ACG GAG GAT GAG CTG TAC CAG CTG 1393 
Thr Val Ser Leu Asp Gin Tyr Gin Thr Glu Asp Glu Leu Tyr Gin Leu 
365 370 375 380 

TCC CTG CAG CGG GAG CCG CGC TCC AAG TCC TCG CCA ACC AGC CCC ACG 1441 
Ser Leu Gin Arg Glu Pro Arg Ser Lys Ser Ser Pro Thr Ser Pro Thr 
385 390 395 

AGT TGC ACC CCA CCA CCC CGG CCC CCG GTA CTG GAG GAG TGG ACC TCG 14 89 

Ser Cys Thr Pro Pro Pro Arg Pro Pro Val Leu Glu Glu Trp Thr Ser 
400 405 410 

GCT GCC AAA CCC AAG CTG GAT CAG GCC CTC GTG GTG GAG CAC ATC GAG 153 7 

Ala Ala Lys Pro Lys Leu Asp Gin Ala Leu Val Val Glu His lie Glu 
415 420 425 

AAG ATG GTG GAG TCT GTG TTC CGG AAC TTT GAC GTC GAT GGG GAT GGC 1585 
Lys Met Val Glu Ser Val Phe Arg Asn Phe Asp Val Asp Gly Asp Gly 
430 435 440 

CAC ATC TCA CAG GAA GAA TTC CAG ATC ATC CGT GGG AAC TTC CCT TAC 163 3 

His lie Ser Gin Glu Glu Phe Gin lie lie Arg Gly Asn Phe Pro Tyr 
445 450 455 460 

CTC AGC GCC TTT GGG GAC CTC GAC CAG AAC CAG GAT GGC TGC ATC AGC 1681 
Leu Ser Ala Phe Gly Asp Leu Asp Gin Asn Gin Asp Gly Cys lie Ser 
465 470 475 

AGG GAG GAG ATG GTT TCC TAT TTC CTG CGC TCC AGC TCT GTG TTG GGG 1729 
Arg Glu Glu Met Val Ser Tyr Phe Leu Arg Ser Ser Ser Val Leu Gly 
480 485 490 

GGG CGC ATG GGC TTC GTA CAC AAC TTC CAG GAG AGC AAC TCC TTG CGC 1777 
Gly Arg Met Gly Phe Val His Asn Phe Gln Glu Ser Asn Ser Leu Arg
495 500 505

CCC GTC GCC TGC CGC CAC TGC AAA GCC CTG ATC CTG GGC ATC TAC AAG 1825
Pro Val Ala Cys Arg His Cys Lys Ala Leu Ile Leu Gly Ile Tyr Lys
510 515 520

CAG GGC CTC AAA TGC CGA GCC TGT GGA GTG AAC TGC CAC AAG CAG TGC 1873
Gln Gly Leu Lys Cys Arg Ala Cys Gly Val Asn Cys His Lys Gln Cys
525 530 535 540

AAG GAT CGC CTG TCA GTT GAG TGT CGG CGC AGG GCC CAG AGT GTG AGC 1921
Lys Asp Arg Leu Ser Val Glu Cys Arg Arg Arg Ala Gln Ser Val Ser
545 550 555 

CTG GAG GGG TCT GCA CCC TCA CCC TCA CCC ATG CAC AGC CAC CAT CAC 1969 
Leu Glu Gly Ser Ala Pro Ser Pro Ser Pro Met His Ser His His His 
560 565 570 

CGC GCC TTC AGC TTC TCT CTG CCC CGC CCT GGC AGG CGA GGC TCC AGG 2017 
Arg Ala Phe Ser Phe Ser Leu Pro Arg Pro Gly Arg Arg Gly Ser Arg 
575 580 585 

CCT CCA GAG ATC CGT GAG GAG GAG GTA CAG ACG GTG GAG GAT GGG GTG 2065 
Pro Pro Glu Ile Arg Glu Glu Glu Val Gln Thr Val Glu Asp Gly Val
590 595 600

TTT GAC ATC CAC TTG TAATAGATGC TGTGGTTGGA TCAAGGACTC ATTCCTGCCT 2120
Phe Asp Ile His Leu
605 610

TGGAGAAAAT ACTTCAACCA GAGCAGGGAG CCTGGGGGTG TCGGGGCAGG AGGCTGGGGA 2180

TGGGGGTGGG ATATGAGGGT GGCATGCAGC TGAGGGCAGG GCCAGGGCTG GTGTCCCTAA 2240

GGTTGTACAG ACTCTTGTGA ATATTTGTAT TTTCCAGATG GAATAAAAAG GCCCGTGTAA 2300

TTAACCTTC 2309 

(2) INFORMATION FOR SEQ ID NO : 7 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 609 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7 : 

Met Ala Gly Thr Leu Asp Leu Asp Lys Gly Cys Thr Val Glu Glu Leu 
1 5 10 15

Leu Arg Gly Cys Ile Glu Ala Phe Asp Asp Ser Gly Lys Val Arg Asp
20 25 30

Pro Gln Leu Val Arg Met Phe Leu Met Met His Pro Trp Tyr Ile Pro
35 40 45

Ser Ser Gln Leu Ala Ala Lys Leu Leu His Ile Tyr Gln Gln Ser Arg
50 55 60

Lys Asp Asn Ser Asn Ser Leu Gln Val Lys Thr Cys His Leu Val Arg
65 70 75 80

Tyr Trp Ile Ser Ala Phe Pro Ala Glu Phe Asp Leu Asn Pro Glu Leu
85 90 95

Ala Glu Gln Ile Lys Glu Leu Lys Ala Leu Leu Asp Gln Glu Gly Asn
100 105 110

Arg Arg His Ser Ser Leu Ile Asp Ile Asp Ser Val Pro Thr Tyr Lys
115 120 125

Trp Lys Arg Gln Val Thr Gln Arg Asn Pro Val Gly Gln Lys Lys Arg
130 135 140

Lys Met Ser Leu Leu Phe Asp His Leu Glu Pro Met Glu Leu Ala Glu
145 150 155 160

His Leu Thr Tyr Leu Glu Tyr Arg Ser Phe Cys Lys Ile Leu Phe Gln
165 170 175

Asp Tyr His Ser Phe Val Thr His Gly Cys Thr Val Asp Asn Pro Val
180 185 190

Leu Glu Arg Phe Ile Ser Leu Phe Asn Ser Val Ser Gln Trp Val Gln
195 200 205

Leu Met Ile Leu Ser Lys Pro Thr Ala Pro Gln Arg Ala Leu Val Ile
210 215 220

Thr His Phe Val His Val Ala Glu Lys Leu Leu Gln Leu Gln Asn Phe
225 230 235 240

Asn Thr Leu Met Ala Val Val Gly Gly Leu Ser His Ser Ser Ile Ser
245 250 255 



WO 98/53061 PCT/AU98/00380



Arg Leu Lys Glu Thr His Ser His Val Ser Pro Glu Thr Ile Lys Leu
260 265 270 

Trp Glu Gly Leu Thr Glu Leu Val Thr Ala Thr Gly Asn Tyr Gly Asn 
275 280 285 

Tyr Arg Arg Arg Leu Ala Ala Cys Val Gly Phe Arg Phe Pro Ile Leu
290 295 300

Gly Val His Leu Lys Asp Leu Val Ala Leu Gln Leu Ala Leu Pro Asp
305 310 315 320

Trp Leu Asp Pro Ala Arg Thr Arg Leu Asn Gly Ala Lys Met Lys Gln
325 330 335

Leu Phe Ser Ile Leu Glu Glu Leu Ala Met Val Thr Ser Leu Arg Pro
340 345 350

Pro Val Gln Ala Asn Pro Asp Leu Leu Ser Leu Leu Thr Val Ser Leu
355 360 365

Asp Gln Tyr Gln Thr Glu Asp Glu Leu Tyr Gln Leu Ser Leu Gln Arg
370 375 380 

Glu Pro Arg Ser Lys Ser Ser Pro Thr Ser Pro Thr Ser Cys Thr Pro 
385 390 395 400 

Pro Pro Arg Pro Pro Val Leu Glu Glu Trp Thr Ser Ala Ala Lys Pro 
405 410 415 

Lys Leu Asp Gln Ala Leu Val Val Glu His Ile Glu Lys Met Val Glu
420 425 430

Ser Val Phe Arg Asn Phe Asp Val Asp Gly Asp Gly His Ile Ser Gln
435 440 445

Glu Glu Phe Gln Ile Ile Arg Gly Asn Phe Pro Tyr Leu Ser Ala Phe
450 455 460

Gly Asp Leu Asp Gln Asn Gln Asp Gly Cys Ile Ser Arg Glu Glu Met
465 470 475 480 

Val Ser Tyr Phe Leu Arg Ser Ser Ser Val Leu Gly Gly Arg Met Gly 
485 490 495 

Phe Val His Asn Phe Gln Glu Ser Asn Ser Leu Arg Pro Val Ala Cys
500 505 510

Arg His Cys Lys Ala Leu Ile Leu Gly Ile Tyr Lys Gln Gly Leu Lys
515 520 525

Cys Arg Ala Cys Gly Val Asn Cys His Lys Gln Cys Lys Asp Arg Leu
530 535 540

Ser Val Glu Cys Arg Arg Arg Ala Gln Ser Val Ser Leu Glu Gly Ser
545 550 555 560 

Ala Pro Ser Pro Ser Pro Met His Ser His His His Arg Ala Phe Ser 
565 570 575 

Phe Ser Leu Pro Arg Pro Gly Arg Arg Gly Ser Arg Pro Pro Glu Ile
580 585 590

Arg Glu Glu Glu Val Gln Thr Val Glu Asp Gly Val Phe Asp Ile His
595 600 605

Leu

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 832 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 11..733

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GCCCGCCGCC ATG CCG CCC TTA CTG CCC CTG CGC CTG TGC CGG CTG TGG 49 
Met Pro Pro Leu Leu Pro Leu Arg Leu Cys Arg Leu Trp 
1 5 10

CCC CGC AAC CCT CCC TCC CGG CTC CTC GGA GCG GCC GCC GGG CAG CGG 97
Pro Arg Asn Pro Pro Ser Arg Leu Leu Gly Ala Ala Ala Gly Gln Arg
15 20 25

TCC AGA CCC AGT ACT TAT TAT GAA CTG TTG GGG GTG CAT CCT GGT GCC 145
Ser Arg Pro Ser Thr Tyr Tyr Glu Leu Leu Gly Val His Pro Gly Ala
30 35 40 45

AGC ACT GAG GAA GTT AAA CGA GCT TTC TTC TCC AAG TCC AAA GAG CTG 193
Ser Thr Glu Glu Val Lys Arg Ala Phe Phe Ser Lys Ser Lys Glu Leu 
50 55 60 

CAC CCA GAC CGG GAC CCT GGG AAC CCA AGC CTG CAC AGC CGC TTT GTG 241 
His Pro Asp Arg Asp Pro Gly Asn Pro Ser Leu His Ser Arg Phe Val 
65 70 75 

GAG CTG AGC GAG GCA TAC CGT GTG CTC AGC CGT GAG CAG AGC CGC CGC 289
Glu Leu Ser Glu Ala Tyr Arg Val Leu Ser Arg Glu Gln Ser Arg Arg
80 85 90

AGC TAT GAT GAC CAG CTC CGC TCA GGT AGT CCC CCA AAG TCT CCA CGA 337
Ser Tyr Asp Asp Gln Leu Arg Ser Gly Ser Pro Pro Lys Ser Pro Arg
95 100 105

ACC ACA GTC CAT GAC AAG TCT GCC CAC CAA ACA CAC AGC TCC TGG ACA 385
Thr Thr Val His Asp Lys Ser Ala His Gln Thr His Ser Ser Trp Thr
110 115 120 125

CCC CCC AAC GCA CAG TAC TGG TCC CAG TTT CAC AGC GTG AGG CCA CAG 433
Pro Pro Asn Ala Gln Tyr Trp Ser Gln Phe His Ser Val Arg Pro Gln
130 135 140

GGG CCC CAG TTG AGG CAG CAG CAA CAC AAA CAA AAC AAA CAA GTG CTG 481
Gly Pro Gln Leu Arg Gln Gln Gln His Lys Gln Asn Lys Gln Val Leu
145 150 155

GGG TAC TGC CTC CTC CTC ATG CTG GCG GGC ATG GGC CTG CAC TAC ATT 529
Gly Tyr Cys Leu Leu Leu Met Leu Ala Gly Met Gly Leu His Tyr Ile
160 165 170

GCC TTC AGG AAG GTG AAG CAG ATG CAC CTT AAC TTC ATG GAT GAA AAG 577
Ala Phe Arg Lys Val Lys Gln Met His Leu Asn Phe Met Asp Glu Lys
175 180 185

GAT CGG ATC ATC ACA GCC TTC TAC AAC GAA GCC CGG GCA CGG GCC AGG 625
Asp Arg Ile Ile Thr Ala Phe Tyr Asn Glu Ala Arg Ala Arg Ala Arg
190 195 200 205

GCC AAC AGA GGC ATC CTT CAG CAG GAG CGA CAA CGG CTA GGG CAG CGG 673
Ala Asn Arg Gly Ile Leu Gln Gln Glu Arg Gln Arg Leu Gly Gln Arg
210 215 220

CAG CCG CCA CCA TCC GAG CCA ACC CAA GGC CCC GAG ATC GTG CCC CGG 721
Gln Pro Pro Pro Ser Glu Pro Thr Gln Gly Pro Glu Ile Val Pro Arg
225 230 235 

GGC GCC GGC CCC TGA GGGGCTC ACCTGGATGG GGCCTGCAGT GCGTTCCCGC 773 
Gly Ala Gly Pro * 
240 

TTTGCTTCCT TCCCTGGACG GCCCGCTCCC CGAAACGCGC GCAATAAAGT GATTCGCAG 832 
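
The feature table for SEQ ID NO: 8 gives the coding region as 11..733, and the corresponding protein (SEQ ID NO: 9) is stated to be 241 amino acids long. The two figures can be cross-checked with a little arithmetic. The following is only a minimal sketch in Python; the helper name `codons_in_cds` is a hypothetical illustration and forms no part of the specification.

```python
def codons_in_cds(start: int, end: int) -> int:
    """Return the number of complete codons in a 1-based, inclusive CDS range.

    Illustrative helper only: raises ValueError if the range does not
    contain a whole number of three-base codons.
    """
    length = end - start + 1          # inclusive range, e.g. 11..733 -> 723 bases
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3


# CDS location quoted in the feature table for SEQ ID NO: 8
print(codons_in_cds(11, 733))  # 241
```

The range 11..733 spans 723 bases, i.e. 241 codons, which agrees with the stated length of SEQ ID NO: 9; the stop codon shown after residue 241 lies outside the quoted range.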

(2) INFORMATION FOR SEQ ID NO : 9 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 241 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 9 : 

Met Pro Pro Leu Leu Pro Leu Arg Leu Cys Arg Leu Trp Pro Arg Asn 
1 5 10 15

Pro Pro Ser Arg Leu Leu Gly Ala Ala Ala Gly Gln Arg Ser Arg Pro
20 25 30 

Ser Thr Tyr Tyr Glu Leu Leu Gly Val His Pro Gly Ala Ser Thr Glu 
35 40 45 

Glu Val Lys Arg Ala Phe Phe Ser Lys Ser Lys Glu Leu His Pro Asp 
50 55 60 

Arg Asp Pro Gly Asn Pro Ser Leu His Ser Arg Phe Val Glu Leu Ser 
65 70 75 80 

Glu Ala Tyr Arg Val Leu Ser Arg Glu Gln Ser Arg Arg Ser Tyr Asp
85 90 95

Asp Gln Leu Arg Ser Gly Ser Pro Pro Lys Ser Pro Arg Thr Thr Val
100 105 110

His Asp Lys Ser Ala His Gln Thr His Ser Ser Trp Thr Pro Pro Asn
115 120 125

Ala Gln Tyr Trp Ser Gln Phe His Ser Val Arg Pro Gln Gly Pro Gln
130 135 140

Leu Arg Gln Gln Gln His Lys Gln Asn Lys Gln Val Leu Gly Tyr Cys
145 150 155 160

Leu Leu Leu Met Leu Ala Gly Met Gly Leu His Tyr Ile Ala Phe Arg
165 170 175

Lys Val Lys Gln Met His Leu Asn Phe Met Asp Glu Lys Asp Arg Ile
180 185 190

Ile Thr Ala Phe Tyr Asn Glu Ala Arg Ala Arg Ala Arg Ala Asn Arg
195 200 205

Gly Ile Leu Gln Gln Glu Arg Gln Arg Leu Gly Gln Arg Gln Pro Pro
210 215 220

Pro Ser Glu Pro Thr Gln Gly Pro Glu Ile Val Pro Arg Gly Ala Gly
225 230 235 240 

Pro 



SEQ ID Nos: 10-18 25-36 



(2) INFORMATION FOR SEQ ID NO : 7 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 170.. 300 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7 : 

CGATTTCATT CCTCGCTCCC CACAGGTCCC TCTCCCCAAA ATATTCCCAT CTTGTCCTAG 60 

CCCATCCCCC AGACTATCTC AAGGACCAGC TGTCCCCACG CCCCCGACCT CCACTAGGCC 120 

TGTGCCACCC GCTGCCTGCA GGAAGACGCC CGGTCCCGGG CCGGGTTAG CCC CAT 175
Pro His
1

GGG AAC GGG GTT CGG TCC GAG CCC GGT GGG AGG CTC CCG GAG CGC AGC 223
Gly Asn Gly Val Arg Ser Glu Pro Gly Gly Arg Leu Pro Glu Arg Ser
5 10 15

CTG GGC CCA GCC CAC CCC GCG CCG GCG GCC ATG GCA GGC ACC CTG GAC 271
Leu Gly Pro Ala His Pro Ala Pro Ala Ala Met Ala Gly Thr Leu Asp
20 25 30

CTG GAC AAG GGC TGC ACG GTG GAG GAG CT 300
Leu Asp Lys Gly Cys Thr Val Glu Glu Leu
35 40



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 44 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Pro His Gly Asn Gly Val Arg Ser Glu Pro Gly Gly Arg Leu Pro Glu
1 5 10 15

Arg Ser Leu Gly Pro Ala His Pro Ala Pro Ala Ala Met Ala Gly Thr
20 25 30

Leu Asp Leu Asp Lys Gly Cys Thr Val Glu Glu Leu
35 40



(2) INFORMATION FOR SEQ ID NO : 9 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 9 : 
GGGATCCCCC TGGTC 15 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Asp Val Asp Glu Glu Asp Glu Val Glu Asp Ile Glu Phe
1 5 10

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Asp Val Asp Gly Asp Gly His Ile Ser Gln Glu Glu Phe
1 5 10

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Asp His Asp Arg Asp Gly Phe Ile Ser Gln Glu Glu Phe
1 5 10

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Asp Gln Asn Gln Asp Gly Cys Ile Ser Arg Glu Glu Met
1 5 10

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Asp Val Asp Met Asp Gly Gln Ile Ser Lys Asp Glu Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

His Phe Val His Val Ala Glu Lys Leu Leu Gln Leu Gln Asn Phe Asn
1 5 10 15

Thr Leu Met Ala Val Val Gly Gly Leu Ser His Ser Ser Ile Ser Arg
20 25 30

Leu Lys Glu Thr His
35

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Lys Phe Val His Val Ala Lys His Leu Arg Lys Ile Asn Asn Phe Asn
1 5 10 15

Thr Leu Met Ser Val Val Gly Gly Ile Thr His Ser Ser Val Ala Arg
20 25 30 

Leu Ala Lys Thr Tyr 
35 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

His Asn Phe Gln Glu Ser Asn Ser Leu Arg Pro Val Ala Cys Arg His
1 5 10 15

Cys Lys Ala Leu Ile Leu Gly Ile Tyr Lys Gln Gly Leu Lys Cys Arg
20 25 30

Ala Cys Gly Val Asn Cys His Lys Gln Cys Lys Asp Arg Leu Ser Val
35 40 45 

Glu Cys 
50 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 50 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

His Asn Phe His Glu Thr Thr Phe Leu Thr Pro Thr Thr Cys Asn His 






1 5 10 15

Cys Asn Lys Leu Leu Trp Gly Ile Leu Arg Gln Gly Phe Lys Cys Lys
20 25 30 

Asp Cys Gly Leu Ala Val His Ser Cys Cys Lys Ser Asn Ala Val Ala 
35 40 45 

Glu Cys 
50 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
GGGATCCCCC TGGTC 15 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
GAATTCGGCA CGAGCCGACG G 21 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 78 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
ATGGAGCAGA AGCTGATCTC CGAGGAGGAC CTGCCCGGGG CAGCTGGATC CGCAGCCCAC 60 
CCCGCGCCGG CGGCCATG 78 
(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Met Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Pro Gly Ala Ala Gly
1 5 10 15

Ser Ala Ala His Pro Ala Pro Ala Ala Met 
20 25 
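
Read in frame from its first base, the 78-base sequence of SEQ ID NO: 21 corresponds to the 26 residues listed as SEQ ID NO: 22. The correspondence can be checked with a short translation routine. The sketch below assumes the standard genetic code; the names `CODON_TABLE` and `translate` are hypothetical helpers introduced here for illustration and do not appear in the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (an assumption: the specification does not name a translation table).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1 into one-letter amino acid codes.

    Whitespace is ignored and any trailing partial codon is dropped.
    """
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# SEQ ID NO: 21 as printed in the listing (blocks of ten bases)
SEQ_ID_21 = (
    "ATGGAGCAGA AGCTGATCTC CGAGGAGGAC CTGCCCGGGG CAGCTGGATC CGCAGCCCAC "
    "CCCGCGCCGG CGGCCATG"
)

print(translate(SEQ_ID_21))  # MEQKLISEEDLPGAAGSAAHPAPAAM
```

The one-letter output corresponds residue for residue to the three-letter sequence given for SEQ ID NO: 22.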



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
GGATCCGCAG CCCACCCCGC GCCGGCGGCC ATG 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 



Gly Ser Ala Ala His Pro Ala Pro Ala Ala Met
1 5 10

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
GGACAAAGTG TGTGATGAAC C 
(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
CTCATCCTCC GTCTGATACT G 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27 
GTAGATGTGG ATCAGCTTGG 

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28 
AGGTGGAGAA TGGTCAAGG 
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
GTCATAGTCT GTCTCCTACT

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
ACATAGACAG CGTGCCTACC 
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
TACAACCTTA GGGACACCAG 
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
TGCTGAGCCT GCTCACGGTG 
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
CAAGTGAACA GCACGTCC 

(2) INFORMATION FOR SEQ ID NO: 34: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
GACTATCTCA AGGACCAGCT G 21 
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
GGTTCGGTCC GAGCCCGG 18 



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
GGAGCGATAC TCCAAGTAGG T 



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
AGCGGGCCAG GCCCCTTC 



(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
CATCCTGGTC CAATGCGCTC 



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39 
GCACTGAGGA AGTTAAACGA GC 



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40 
GCTCGTTTAA CTTCCTCAGT GC 
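
As printed, SEQ ID NO: 39 and SEQ ID NO: 40 are reverse complements of one another, so the pair describes the same 22-base site read on opposite strands. The relationship can be confirmed with a few lines of Python. This is only an illustrative sketch; `reverse_complement` is a hypothetical helper name and is not taken from the specification.

```python
# Watson-Crick pairing for unambiguous DNA bases
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string, ignoring whitespace."""
    cleaned = "".join(dna.split()).upper()
    return cleaned.translate(COMPLEMENT)[::-1]


# Sequences as printed in the listing (blocks of ten bases)
SEQ_ID_39 = "GCACTGAGGA AGTTAAACGA GC"
SEQ_ID_40 = "GCTCGTTTAA CTTCCTCAGT GC"

print(reverse_complement(SEQ_ID_39))  # GCTCGTTTAACTTCCTCAGTGC
print(reverse_complement(SEQ_ID_39) == "".join(SEQ_ID_40.split()))  # True
```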
(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41 



GCTCAGCTCC ACAAAGCGGC T 
(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 



ACCAGCTCCG CTCAGGTAG 19 
(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: 



TCCAGGAGCT GTGTGTTTGG 20



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44: 



CCAGTTTCAC AGCGTGAGG 



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 



CAGCATGAGG AGGAGGCAG 19

CLAIMS:

1. An isolated nucleic acid molecule comprising a sequence of nucleotides encoding or
complementary to a sequence encoding an amino acid sequence having homology to a regulator
of gene expression or a derivative of said gene regulator.

2. An isolated nucleic acid molecule according to claim 1 wherein the regulator
comprises a zinc finger domain of an (HC3)2 type.

3. An isolated nucleic acid molecule according to claim 2 wherein the sequence of 
nucleotides or complementary sequence of nucleotides is selected from: 

(i) a nucleotide sequence set forth in SEQ ID NO:2; 

(ii) a nucleotide sequence encoding an amino acid sequence set forth in SEQ ID NO: 3; 

(iii) a nucleotide sequence having at least about 40% similarity to the nucleotide sequence 
of (i) or (ii); and 

(iv) a nucleotide sequence capable of hybridising under low stringency conditions to the 
nucleotide sequence set forth in (i), (ii) or (iii). 

4. An isolated nucleic acid molecule according to claim 1 wherein said gene regulator is 
a guanine nucleotide exchange factor (GEF) or a derivative thereof. 

5. An isolated nucleic acid molecule according to claim 4 wherein the sequence of 
nucleotides is selected from: 

(i) a nucleotide sequence set forth in SEQ ID NO:4 or 6; 

(ii) a nucleotide sequence encoding an amino acid sequence set forth in SEQ ID NO:5 or 
7; 

(iii) a nucleotide sequence having at least about 40% similarity to the nucleotide sequence 
of (i) or (ii); and 

(iv) a nucleotide sequence capable of hybridising under low stringency conditions to the
nucleotide sequence set forth in (i), (ii) or (iii).

6. An isolated nucleic acid molecule according to claim 1, wherein said gene regulator 
is a heat shock protein or is a heat shock binding protein or a derivative thereof. 

7. An isolated nucleic acid molecule according to claim 6, wherein the sequence of 
nucleotides is selected from: 

(i) a nucleotide sequence set forth in SEQ ID NO:8; 

(ii) a nucleotide sequence encoding an amino acid sequence set forth in SEQ ID NO:9; 

(iii) a nucleotide sequence having at least about 40% similarity to the nucleotide sequence 
of (i) or (ii); and 

(iv) a nucleotide sequence capable of hybridising under low stringency conditions to the 
nucleotide sequence set forth in (i), (ii) or (iii). 

8. A genetic construct comprising a vector portion and a gene portion comprising a 
regulator of gene expression or a derivative thereof.

9. A genetic construct according to claim 8 wherein the gene portion comprises a zinc 
finger domain of (HC3)2 type. 

10. A genetic construct according to claim 9 wherein the gene portion comprises a 
nucleotide sequence selected from: 

(i) a nucleotide sequence set forth in SEQ ID NO:2; 

(ii) a nucleotide sequence encoding an amino acid sequence set forth in SEQ ID NO:3; 

(iii) a nucleotide sequence having at least about 40% similarity to the nucleotide sequence 
of (i) or (ii); and 

(iv) a nucleotide sequence capable of hybridising under low stringency conditions to the 
nucleotide sequence set forth in (i), (ii) or (iii).

11. A genetic construct according to claim 8 wherein said gene portion is a nucleotide
exchange factor (GEF) or derivative thereof. 

12. A genetic construct according to claim 11 wherein the gene portion comprises a 
nucleotide sequence selected from: 

(i) a nucleotide sequence set forth in SEQ ID NO:4 or 6; 

(ii) a nucleotide sequence encoding an amino acid sequence set forth in SEQ ID NO:5 or
7;

(iii) a nucleotide sequence having at least about 40% similarity to the nucleotide sequence 
of (i) or (ii); and 

(iv) a nucleotide sequence capable of hybridising under low stringency conditions to the 
nucleotide sequence set forth in (i), (ii) or (iii). 

13. A genetic construct according to claim 8 wherein the gene portion is a heat shock 
protein or a derivative thereof or a heat shock binding protein or derivative thereof. 

14. A genetic construct according to claim 13 wherein the gene portion comprises a 
nucleotide sequence selected from: 

(i) a nucleotide sequence set forth in SEQ ID NO: 8; 

(ii) a nucleotide sequence encoding an amino acid sequence set forth in SEQ ID NO:9; 

(iii) a nucleotide sequence having at least about 40% similarity to the nucleotide sequence 
of (i) or (ii); and 

(iv) a nucleotide sequence capable of hybridising under low stringency conditions to the 
nucleotide sequence set forth in (i), (ii) or (iii). 

15. A nucleic acid molecule encoding a gene regulator having the identifying 
characteristics of a molecule selected from MCG4, MCG7 and MCG18 having respective amino 
acid sequences of SEQ ID NO:3, SEQ ID NO:5 or 7 and SEQ ID NO:9.

16. A method of detecting a condition caused or facilitated by an aberration in mcg4, said
method comprising determining the presence of a single or multiple nucleotide substitution, 
deletion and/or addition or other aberration to one or both alleles of said mcg4 wherein the 
presence of such a nucleotide substitution, deletion and/or addition or other aberration may be 
indicative of said condition or a propensity to develop said condition. 

17. A method of detecting a condition caused or facilitated by an aberration in mcg4, said 
method comprising screening for a single or multiple amino acid substitution, deletion and/or 
addition to MCG4 wherein the presence of such a mutation is indicative of or a propensity to 
develop said condition. 

18. A method for detecting MCG4 or a derivative thereof in a biological sample said 
method comprising contacting said biological sample with an antibody specific for MCG4 or its 
derivatives or homologues for a time and under conditions sufficient for an antibody-MCG4 
complex to form, and then detecting said complex. 

19. A method of detecting a condition caused or facilitated by an aberration in mcg7, said 
method comprising determining the presence of a single or multiple nucleotide substitution, 
deletion and/or addition or other aberration to one or both alleles of said mcg7 wherein the 
presence of such a nucleotide substitution, deletion and/or addition or other aberration may be 
indicative of said condition or a propensity to develop said condition. 

20. A method of detecting a condition caused or facilitated by an aberration in mcg7, said 
method comprising screening for a single or multiple amino acid substitution, deletion and/or 
addition to MCG7 wherein the presence of such a mutation is indicative of or a propensity to 
develop said condition. 

21. A method for detecting MCG7 or a derivative thereof in a biological sample said 
method comprising contacting said biological sample with an antibody specific for MCG7 or its
derivatives or homologues for a time and under conditions sufficient for an antibody-MCG7 
complex to form, and then detecting said complex.

22. A method of detecting a condition caused or facilitated by an aberration in mcg18, said
method comprising determining the presence of a single or multiple nucleotide substitution,
deletion and/or addition or other aberration to one or both alleles of said mcg18 wherein the
presence of such a nucleotide substitution, deletion and/or addition or other aberration may be 
indicative of said condition or a propensity to develop said condition. 

23. A method of detecting a condition caused or facilitated by an aberration in mcg18, said
method comprising screening for a single or multiple amino acid substitution, deletion and/or 
addition to MCG18 wherein the presence of such a mutation is indicative of or a propensity to 
develop said condition. 

24. A method for detecting MCG18 or a derivative thereof in a biological sample said 
method comprising contacting said biological sample with an antibody specific for MCG18 or 
its derivatives or homologues for a time and under conditions sufficient for an antibody-MCG18 
complex to form, and then detecting said complex.

FIGURE 18 (Cont. I)

SmaI/ApaI (both lost) 0.00

ma I (both lost) 1.00



Plasmid name: clone 16 in pGEX-3X 
Plasmid size: 6.00 kb



FIGURE 18 (Cont. II) 

EcoRI 0.00 




Plasmid name: clone 19 in pGEX-1 
Plasmid size: 6.00 kb

FIGURE 18 (Cont. III)

(BamHI) 




HindIII 2.50

Plasmid name: clone 5 in pGEM-11zf
Plasmid size: 5.50 kb 



(BamHI) 0.00




Plasmid name: clone 27 in pGEX-2T
Plasmid size: 7.50 kb 

FIGURE 18 (Cont. IV)

FIG 19 (I)

International Application No.
PCT/AU 98/00380

A. CLASSIFICATION OF SUBJECT MATTER 

Int Cl6: C12N 15/12; C07K 14/47; C07K 16/18; G01N 33/53

According to International Patent Classification (IPC) or to both national classification and IPC 

B. FIELDS SEARCHED 

Minimum documentation searched (classification system followed by classification symbols) 
I/C: WPAT (D gene) Sequences provided by Applicant 



Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched



Electronic data base consulted during the international search (name of data base and, where practicable, search terms used) 
:EMBL, Genebank, Swiss Prot and PIR: Sequences provided by applicant 



INTERNATIONAL SEARCH REPORT 



C. DOCUMENTS CONSIDERED TO BE RELEVANT



Category* 



Citation of document, with indication, where appropriate, of the relevant passages



Relevant to claim No. 



P,X 



P,X 



Kedra D, Seroussi E, Fransson I, Trifunovic J, Clark M, Lagercranz J, 
Blennow E, Mehlin H, Diunanski J, Human Genetics, October 1997 100(5-6) 
611-619 The germinal centre kinase gene and a novel CDC25-like gene are 
located in the vicinity of the PYGM gene on 1 lql3 
EMBL AC Y12339 

Guru S C, Agarwal S K, Manickain P, Olufemi S E, et al Genome Research, 
July 1997 7(7) 725-735. A transcript map for the 2.8-Mb region containing 
the multiple endocrine neoplasia type I locus 

TREMBL AC 014616 " 



1-3,8-10,15-18 



1.4-5,8, 11-12, 15, 
19-21 



[X] Further documents are listed in the continuation of Box C

[ ] See patent family annex



* Special categories of cited documents:

"A" document defining the general state of the art which is 
not considered to be of particular relevance 

"E" earlier document but published on or after the 
international filing date 

"L" document which may throw doubts on priority claim(s) 
or which is cited to establish the publication date of 
another citation or other special reason (as specified) 

"O" document referring to an oral disclosure, use, 
exhibition or other means 

"P" document published prior to the international filing 
date but later than the priority date claimed

"T" later document published after the international filing date or
priority date and not in conflict with the application but cited to
understand the principle or theory underlying the invention

"X" document of particular relevance; the claimed invention cannot
be considered novel or cannot be considered to involve an
inventive step when the document is taken alone

"Y" document of particular relevance; the claimed invention cannot
be considered to involve an inventive step when the document is
combined with one or more other such documents, such
combination being obvious to a person skilled in the art

"&" document member of the same patent family


Date of the actual completion of the international search 
16 July 1998 


Date of mailing of the international search report 

20 JUL 1998


Name and mailing address of the ISA/AU
AUSTRALIAN PATENT OFFICE
PO BOX 200 
WODEN ACT 2606 
AUSTRALIA 

Facsimile No.: (02) 6285 3929


Authorized officer 

GILLIAN ALLEN
Telephone No.: (02) 6283 2266 



Form PCT/ISA/210 (second sheet) (July 1992) copbko 




Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet)

This International Search Report has not been established in respect of certain claims under Article 17(2)(a) for the following
reasons:

[ ] Claims Nos.:

because they relate to subject matter not required to be searched by this Authority, namely:

[X] Claims Nos.: 1, 2, 4, 6

because they relate to parts of the international application that do not comply with the prescribed requirements
to such an extent that no meaningful international search can be carried out, specifically:

They are to known groups of proteins and lack distinguishing features which would enable a meaningful
search.

[ ] Claims Nos.:

because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule
6.4(a).



Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)

This International Searching Authority found multiple inventions in this international application, as follows:

Invention 1, defined by claims 2, 3, 9, 10, 16-18, is to nucleotide sequences, amino acid sequences and proteins with a
zinc finger domain.

Invention 2, defined by claims 4, 5, 11, 12, 19-21, is to nucleotide sequences and amino acid sequences and proteins
which are guanine exchange factors.

Invention 3, defined by claims 6, 7, 13, 14, 22-24, is to nucleotide sequences and amino acid sequences and proteins
which are heat shock proteins or heat shock binding proteins.

1. [X] As all required additional search fees were timely paid by the applicant, this international search report covers
all searchable claims.

2. [ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not
invite payment of any additional fee.

3. [ ] As only some of the required additional search fees were timely paid by the applicant, this international search
report covers only those claims for which fees were paid, specifically claims Nos.:

4. [ ] No required additional search fees were timely paid by the applicant. Consequently, this international search
report is restricted to the invention first mentioned in the claims; it is covered by claims Nos.:

Remark on Protest

[ ] The additional search fees were accompanied by the applicant's protest.

[X] No protest accompanied the payment of additional search fees.
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C (Continuation). DOCUMENTS CONSIDERED TO BE RELEVANT


Category* / Citation of document, with indication, where appropriate, of the relevant passages / Relevant to claim No.

EMBL AC AF012106
DT 6 November 1997
Lloyd S E and Thakker R V
DE Homo sapiens DnaJ protein (HSPF2) mRNA, complete cds
Relevant to claim No.: 1, 6-8, 13-15, 22-24

P,X
EMBL AC AF036875
DT 20 May 1998
Silins G, Grimmond S, Hayward N
DE Mus musculus multiple endocrine neoplasia type 1 candidate protein number 18 mRNA, complete cds
Relevant to claim No.: 1, 6-8, 13-15, 22-24


Form PCT/ISA/210 (continuation of second sheet) (July 1992) copbko



PATENT COOPERATION TREATY

PCT

INTERNATIONAL PRELIMINARY EXAMINATION REPORT

(PCT Article 36 and Rule 70) 



Applicant's or agent's file reference 
204908 1/EJH 



FOR FURTHER 
ACTION 




See Notification of Transmittal of International Preliminary 
Examination Report (Form PCT/IPEA/416). 



International application No. 
PCT/AU 98/00380 



International filing date (day/month/year)
11 May 1998



Priority Date (day/month/year) 
23 May 1997 



International Patent Classification (IPC) or national classification and IPC 
Int. Cl.: C12N 15/12; C07K 14/47



Applicant 



AMRAD Operations Pty Ltd 



This international preliminary examination report has been prepared by this International Preliminary Examining 
Authority and is transmitted to the applicant according to Article 36. 

This REPORT consists of a total of 5 sheets, including this cover sheet. 

[ ] This report is also accompanied by ANNEXES, i.e., sheets of the description, claims and/or drawings which have
been amended and are the basis for this report and/or sheets containing rectifications made before this Authority 
(see Rule 70.16 and Section 607 of the Administrative Instructions under the PCT). 

These annexes consist of a total of sheet(s). 



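3. This report contains indications relating to the following items:

I    [X] Basis of the report
II   [ ]
III  [ ]
IV   Lack of unity of invention
V    Reasoned statement under Article 35(2) with regard to novelty, inventive step or industrial applicability;
     citations and explanations supporting such statement
VI   [ ]
VII  [ ]
VIII [X] Certain observations on the international application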



Date of submission of the demand 
17 December 1998 


Date of completion of the report 
6 July 1998 


Name and mailing address of the IPEA/AU 

AUSTRALIAN PATENT OFFICE 

PO BOX 200 

WODEN ACT 2606 

AUSTRALIA 

Facsimile No. (02) 6285 3929 


Authorized Officer 

GILLIAN ALLEN 

Telephone No. (02) 6283 2266 



Form PCT/IPEA/409 (Cover sheet) (July 1998) 
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INTERNATIONAL PRELIMINARY EXAMINATION REPORT    International application No.

PCT/AU 98/00380 



Basis of the report 



1. With regard to the elements of the international application:* 

[X] the international application as originally filed.

[ ] the description, pages , as originally filed,
    pages , filed with the demand,
    pages , filed with the letter of .

[ ] the claims, pages , as originally filed,
    pages , as amended (together with any statement) under Article 19,
    pages , filed with the demand,
    pages , filed with the letter of .

[ ] the drawings, pages , as originally filed,
    pages , filed with the demand,
    pages , filed with the letter of .

[ ] the sequence listing part of the description:
    pages , as originally filed
    pages , filed with the demand
    pages , filed with the letter of

2. With regard to the language, all the elements marked above were available or furnished to this Authority in the language in
which the international application was filed, unless otherwise indicated under this item.

These elements were available or furnished to this Authority in the following language which is:

[ ] the language of a translation furnished for the purposes of international search (under Rule 23.1(b)).
[ ] the language of publication of the international application (under Rule 48.3(b)).
[ ] the language of the translation furnished for the purposes of international preliminary examination (under Rules 55.2
and/or 55.3).

3. With regard to any nucleotide and/or amino acid sequence disclosed in the international application, the international
preliminary examination was carried out on the basis of the sequence listing:

[ ] contained in the international application in written form.

[ ] filed together with the international application in computer readable form.

[ ] furnished subsequently to this Authority in written form.

[ ] furnished subsequently to this Authority in computer readable form.

[ ] The statement that the subsequently furnished written sequence listing does not go beyond the disclosure in the
international application as filed has been furnished.

[ ] The statement that the information recorded in computer readable form is identical to the written sequence listing has
been furnished.

4. The amendments have resulted in the cancellation of: 

[ ] the description, pages

[ ] the claims, Nos.

[ ] the drawings, sheets/fig

5. This report has been established as if (some of) the amendments had not been made, since they have been considered
to go beyond the disclosure as filed, as indicated in the Supplemental Box (Rule 70.2(c)).**

* Replacement sheets which have been furnished to the receiving Office in response to an invitation under Article 14 are referred to in this
report as "originally filed" and are not annexed to this report since they do not contain amendments (Rules 70.16 and 70.17).
** Any replacement sheet containing such amendments must be referred to under item 1 and annexed to this report.
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PCT/AU 98/00380 



IV. Lack of unity of invention



In response to the invitation to restrict or pay additional fees the applicant has:

[ ] restricted the claims.

[ ] paid additional fees.

[ ] paid additional fees under protest.

[ ] neither restricted nor paid additional fees.

[X] This Authority found that the requirement of unity of invention is not complied with and chose, according to Rule
68.1, not to invite the applicant to restrict or pay additional fees.

3. This Authority considers that the requirement of unity of invention in accordance with Rules 13.1, 13.2 and 13.3 is

[ ] complied with,

[ ] not complied with for the following reasons:
The Application is to three separate proteins or groups of proteins. 

Claims 2, 3, 9, 10, 16-18 are to zinc finger proteins or their encoding nucleic acid sequences. 

Claims 4, 5, 11, 12, 19-21 are to guanine exchange factor proteins or their encoding nucleic acid sequences.

Claims 6, 7, 13, 14, 22-24 are to heat shock or heat shock binding proteins. 

There is no sequence homology between the three protein types.

The only unifying feature is their function as gene regulatory proteins. However, gene regulatory proteins of all three
types are known. It is also known that gene regulatory proteins can exert their effects via a variety of mechanisms.
Therefore, this feature does not provide unity according to Rule 13.2 of the PCT.



Consequently, the following parts of the international application were the subject of international preliminary 
examination in establishing this report: 

[X] all parts.

[ ] the parts relating to claims Nos.
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INTERNATIONAL PRELIMINARY EXAMINATION REPORT



International application No. 
PCT/AU 98/00380 



Reasoned statement under Article 35(2) with regard to novelty, inventive step or industrial applicability; 
citations and explanations supporting such statement 



1. Statement

Novelty (N)                      Claims 3, 5, 7, 10, 12, 14-24       YES
                                 Claims 1, 2, 4, 6, 8, 9, 11, 13     NO

Inventive step (IS)              Claims 3, 5, 7, 10, 12, 14-24       YES
                                 Claims 1, 2, 4, 6, 8, 9, 11, 13     NO

Industrial applicability (IA)    Claims 1-24                         YES
                                 Claims                              NO



2. Citations and explanations (Rule 70.7) 

Citations 

There was no close prior art found in the International Search to the claims searched. However, claims 1, 2, 4, and 6 were
not searched.



Novelty and Inventive Step

Claims 1, 2, 4, 6, 8, 9, 11, 13 lack novelty and inventive step over the common general knowledge of the art and the 
disclosures of the description. 

Claims 1 and 8 are to any protein regulator of gene expression. It is well known to anyone skilled in the art that gene 
expression can be regulated by proteins, and very many such proteins are known and have been sequenced. Therefore 
these claims lack novelty over the common general knowledge of the art. 

Claims 2 and 9 are to (HC3)2 type zinc finger proteins. Zinc fingers are well known structural domains or motifs which
bind to DNA. Figs 4-6 of the description disclose the structure of the zinc finger motif and zinc finger proteins,
homologous to MCG4, from C. elegans and Saccharomyces pombe. Therefore claims 1, 2, 8 and 9 lack novelty over these
disclosures.

Claims 4 and 11 are to guanine exchange factors (GEFs). This is a well known group of proteins which include the Ras
oncogene.

Fig 12 discloses a number of known GEFs. Thus claims 1, 4, 8 and 11 lack novelty over these disclosures.

Claims 6 and 13 are to heat shock or heat shock binding proteins. Once again these are an extremely well known group of
proteins. Figures 20 and 24 disclose DnaJ proteins, a type of heat shock protein. Thus claims 1, 6, 8 and 13 are not novel
over these disclosures.



There is no close prior art which discloses the nucleic or amino acid sequences of the proteins designated MCG4, MCG7 
or MCG18. Therefore claims 3, 5, 7, 10, 12 and 14-24 are considered novel and inventive. 
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4 PCT/AU 98/00380 

VIII. Certain observations on the international application



The following observations on the clarity of the claims, description, and drawings or on the question whether the claims are fully
supported by the description, are made: 

Claims 3, 5, 7, 10, 12 and 14, parts (iii) and (iv) of each, are to nucleic acid sequences with at least 40% homology to, or
which hybridise at low stringency with the nucleic acid sequences specified in the claims. It is considered that the 
description does not fully support claims to sequences having such low homology to those disclosed in the description. 
Nor does it support nucleic acid sequences which encode proteins of different functions to MCG4, MCG7 or MCG18, or 
which do not encode any polypeptide. 

Claims 3, 5, 7, 10, 12 and 14, part (iv) are not clear. They place no limit on the length of the hybridising sequences, so 
the scope of the claims is indeterminate. 
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